Abstract

Analysis of (partial) groundness is an important application of abstract
interpretation. There are several proposals for improving the precision of such an
analysis by exploiting type information, including our own work with Hill and
King [SHK00], where we had shown how the information present in the type
declarations of a program can be used to characterise the degree of instantiation
of a term in a precise and yet inherently finite way. This approach worked for
polymorphically typed programs as in Gödel or HAL. Here, we recast this
approach following Codish, Lagoon and Stuckey [CL00, LS01]. To formalise which
properties of terms we want to characterise, we use labelling functions, which are
functions that extract subterms from a term along certain paths. An abstract
term collects the results of all labelling functions of a term. For the analysis,
programs are executed on abstract terms instead of the concrete ones, and usual
unification is replaced by unification modulo an equality theory which includes
the well-known ACI-theory. Thus we generalise [CL00, LS01] w.r.t. the type
systems considered and relate those two works.

1 Introduction

Analysing logic programs for properties such as sharing and (partial) groundness is
important in compiler optimisations and program development tools. Programs are
usually analysed using abstract interpretation [CC77]. In this paper, we consider in
particular the framework of abstract compilation [CD95]: a program is abstracted by
replacing each unification with an abstract counterpart, and then the abstract program
is evaluated just like a concrete program.

It has been recognised for some time that abstract interpretation can be used for
type analysis, and conversely, that type information available a priori can improve the
precision of other analyses [BM95, CD94, CL00, GdW94, JB92, VCL93]. For example,
being able to say that [1,X] is a list skeleton with possibly uninstantiated elements is
more precise than only being able to distinguish a ground from a possibly non-ground
term. Underlying all those works is a descriptive view of types: types are not part of
the programming language (in particular, no program is rejected for not being
"well-typed"), but rather introduced to analyse an arbitrary, say Prolog, program. In such
approaches, it is natural that there is no sharp line between type analysis and mode
(groundness, instantiation) analysis. For example, saying that a term is a list has
two aspects: it is a list as opposed to, say, an integer; it is a list, as opposed to an
uninstantiated variable.

Underlying this paper is a prescriptive view of types, i.e., types are a part of the
programming language. We analyse programs written in typed logic programming
languages such as Gödel [HL94], HAL [DMGMS99], or Mercury [SHC96]. This
implies that the types do not have to be analysed since they are given beforehand by
declarations or inference. In particular, unlike e.g. [CL00], we do not have to deal with
"ill-typed" terms such as [1|2], since these can never occur.

*CWI, Amsterdam, The Netherlands, jan.smaus@cwi.nl

We are aware of only two other works following this view: our own [SHK00] and the
recent work by Lagoon and Stuckey [LS01]. This paper is a synthesis of those two works
and [CL00], which, although designed for a descriptive view of typing, can be adapted
to prescriptive typing.^1 The generalisation w.r.t. [CL00, LS01] concerns polymorphism,
which is disregarded in [LS01] and considered in [CL00] only in a restricted form. We
recast our own previous work using some aspects of their formalisms. In particular,
the use of the notions of grammar and variables labelling non-terminals [LS01] should
improve the understanding of what properties of terms our analysis captures, whereas
the use of ACI-unification [CL00] may provide the basis for an implementation using
well-studied algorithms. Also, we hope that our work will prove to be applicable to
analyses previously not envisaged by us, such as sharing analysis [LS01].

In the intuitive explanations that follow, we refer to a set of possible
characterisations of the instantiation of a term as an abstract domain.

The standard example to illustrate the benefits of an instantiation analysis using 
types is the ubiquitous APPEND program. For example, for the query append([A], [B], C), 
a typed analysis is able to infer that any answer substitution will bind C to a list 
skeleton. However, this example is unfit to explain the advance of this paper over 
previous works. 

We therefore give another example. A table is a data structure containing an
ordered collection of nodes, each of which has two components, a key of type string,
and a value, of arbitrary type. That is to say, the type constructor table is
parametrised by the type of the values. For any type $\tau$, table($\tau$) is the type of tables
whose values have type $\tau$. Tables are implemented in Gödel as an AVL-tree [Emd81]:
a non-leaf node has a key argument, a value argument, arguments for the left and right
subtrees, and an argument which represents balancing information. For a term of type
table($\tau$), our abstract domain characterises the instantiation of all key arguments, all
value arguments, and all the arguments representing the balancing information.

The characterisation of the instantiation of the value arguments depends on $\tau$.
Hence, our analysis supports parametric polymorphism. In devising an analysis for
polymorphically typed programs, there are two main problems: the construction of
an abstract domain for table($\tau$) should be truly parametric in $\tau$, and the abstract
domains should be finite for a given program and query. We only briefly illustrate
what these points mean here. Explaining why these requirements are non-trivial is
technically too involved for this introduction.

The statement that the construction of an abstract domain for table($\tau$) is truly
parametric in $\tau$ means, for example, that the abstract domain for table(str) relates
to str in exactly the same way as the abstract domain for table(int) relates to int.^2
This implies that the abstraction of a table can be defined in a generic way.

Lagoon and Stuckey formalise types as regular tree grammars. Each type is
identified with a non-terminal in the grammar, and it is assumed that there are only finitely
many types. Finiteness is crucial for the termination of an analysis. When we
extend this approach to polymorphism, finiteness becomes a problem, since there are
infinitely many types, e.g. list(int), list(list(int)), .... Nevertheless, under
certain conditions, it can be ensured that for a given query and program, there are
only finitely many types. Note that this is in contrast to [CL00], where it is proposed
that termination of analyses of polymorphic programs should be enforced by imposing
an ad-hoc bound on the depth of types.

^1 The journal article [CL00] is based on an earlier article [CL96]. However, there are some
interesting differences, and therefore we will also sometimes refer to the earlier article.
^2 We abbreviate string by str and integer by int.

The rest of this paper is organised as follows. The next section provides some
preliminaries. In Sec. 3, following [LS01], we show how the type of a term gives rise
to characterising its degree of instantiation in a structured way. In Sec. 4,
following [CL00], we define abstract terms based on the ACI1 equality theory. In Sec. 5,
we formalise how abstract terms capture the degree of instantiation of concrete terms,
thereby linking the two preceding sections, and also linking [LS01] with [CL00].
Section 6 lifts the abstraction of terms to an abstraction of programs, and relates the
semantics of a concrete program and its abstraction in the sense of abstract
interpretation. Section 7 makes some comments on a possible future implementation, and
Sec. 8 discusses our results.



2 Preliminaries 

The reader is assumed to be familiar with the basics of logic programming.

We use a type system for logic programs with parametric polymorphism
[DMGMS99, HL94, SHC96].



Let $\mathcal{K}$ be a finite set of (type) constructors, each $c \in \mathcal{K}$ with an arity $n \geq 0$
associated (by writing $c/n$), and $\mathcal{U}$ be a denumerable set of parameters. The set of
types is the term structure $\mathcal{T}(\mathcal{K},\mathcal{U})$. A type substitution is an idempotent mapping
from parameters to types which is the identity almost everywhere. We define the
order $\prec$ on types as the order induced by some (for example lexicographical) order on
constructor and parameter symbols, where parameter symbols come before constructor
symbols.

The set of parameters in a syntactic object $o$ is denoted by $\mathit{pars}(o)$. Parameters are
denoted by $u, v$, in concrete examples by U, V. A tuple of distinct parameters ordered
with respect to $\prec$ is denoted by $\bar{u}, \bar{v}$.

Let $\mathcal{V}$ be a denumerable set of variables. The set of variables in a syntactic object
$o$ is denoted by $\mathit{vars}(o)$. Variables are denoted by $x, y$, in concrete examples by X, Y.
A tuple of distinct variables is denoted by $\bar{x}, \bar{y}$.

A variable typing is a mapping from a finite subset of $\mathcal{V}$ to $\mathcal{T}(\mathcal{K},\mathcal{U})$, written as
$\{x_1 : \tau_1, \ldots, x_n : \tau_n\}$.

Let $\mathcal{F}$ (resp. $\mathcal{P}$) be a finite set of function (resp. predicate) symbols, each with an
arity and a declared type associated with it, such that: for each $f \in \mathcal{F}$, the declared
type has the form $(\tau_1, \ldots, \tau_n, \tau)$, where $n$ is the arity of $f$,
$(\tau_1, \ldots, \tau_n, \tau) \in \mathcal{T}(\mathcal{K},\mathcal{U})^{n+1}$, and $\tau$ satisfies the transparency condition [HT92]:
$\mathit{pars}(\tau_1, \ldots, \tau_n) \subseteq \mathit{pars}(\tau)$; for each $p \in \mathcal{P}$, the declared type has the form
$(\tau_1, \ldots, \tau_n)$, where $n$ is the arity of $p$ and $(\tau_1, \ldots, \tau_n) \in \mathcal{T}(\mathcal{K},\mathcal{U})^n$. We often
indicate the declared types by writing $f_{\tau_1 \ldots \tau_n \to \tau}$ and $p_{\tau_1 \ldots \tau_n}$.

Throughout this paper, we assume $\mathcal{K}$, $\mathcal{F}$, and $\mathcal{P}$ arbitrary but fixed. The typed
language, i.e. a language of terms, atoms etc. based on $\mathcal{K}$, $\mathcal{F}$, and $\mathcal{P}$, is defined by
the rules in Table 1. All objects are defined relative to a variable typing $\Gamma$, and
$\_ \vdash \ldots$ stands for "there exists $\Gamma$ such that $\Gamma \vdash \ldots$". Actually, we will rarely refer
to the type system explicitly, but it should be noted that any objects we will come across
in the context of analysing a typed program will be correctly typed according to those rules.
This is guaranteed because typed programs have the subject reduction property [HT92].



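Table 1: Rules defining a typed language ($\Theta$ is a type substitution)

$$
\begin{array}{ll}
(\mathit{Var}) & \{x : \tau, \ldots\} \vdash x : \tau \\[2ex]
(\mathit{Func}) & \dfrac{\Gamma \vdash t_1 : \tau_1\Theta \quad \cdots \quad \Gamma \vdash t_n : \tau_n\Theta}{\Gamma \vdash f_{\tau_1 \ldots \tau_n \to \tau}(t_1, \ldots, t_n) : \tau\Theta} \\[2ex]
(\mathit{Atom}) & \dfrac{\Gamma \vdash t_1 : \tau_1\Theta \quad \cdots \quad \Gamma \vdash t_n : \tau_n\Theta}{\Gamma \vdash p_{\tau_1 \ldots \tau_n}(t_1, \ldots, t_n)\ \mathit{Atom}} \\[2ex]
(\mathit{Head}) & \dfrac{\Gamma \vdash t_1 : \tau_1 \quad \cdots \quad \Gamma \vdash t_n : \tau_n}{\Gamma \vdash p_{\tau_1 \ldots \tau_n}(t_1, \ldots, t_n)\ \mathit{Head}} \\[2ex]
(\mathit{Query}) & \dfrac{\Gamma \vdash A_1\ \mathit{Atom} \quad \cdots \quad \Gamma \vdash A_n\ \mathit{Atom}}{\Gamma \vdash A_1, \ldots, A_n\ \mathit{Query}} \\[2ex]
(\mathit{Clause}) & \dfrac{\Gamma \vdash A\ \mathit{Head} \quad \Gamma \vdash Q\ \mathit{Query}}{\Gamma \vdash A \leftarrow Q\ \mathit{Clause}} \\[2ex]
(\mathit{Program}) & \dfrac{\_ \vdash C_1\ \mathit{Clause} \quad \cdots \quad \_ \vdash C_n\ \mathit{Clause}}{\_ \vdash \{C_1, \ldots, C_n\}\ \mathit{Program}}
\end{array}
$$

Concerning semantics, we entirely follow [CL00]. The set of atoms is denoted by
$\mathcal{B}$, and elements of $2^{\mathcal{B}}$ are called interpretations. For a syntactic object $o$ and a
set of objects $I$, we denote by $(C_1, \ldots, C_n) \ll_o I$ that $C_1, \ldots, C_n$ are elements of $I$
renamed apart from $o$ and from each other. The analysis we shall propose is generic
and independent of any particular (say top-down or bottom-up) concrete semantics,
but examples will be given using the s-semantics, i.e. the semantics based on the
non-ground $T_P$-operator, defined as follows:

$$
T_P(I) = \{ H\theta \mid C = H \leftarrow B_1, \ldots, B_n \in P,\ (A_1, \ldots, A_n) \ll_C I,\
\theta = \mathit{MGU}((B_1, \ldots, B_n), (A_1, \ldots, A_n)) \}.
$$

We denote by $[\![P]\!]_s$ the least fixpoint of $T_P$.

We denote by $t_1 \leq t_2$ that $t_1$ is an instance of $t_2$. The domain of a substitution $\theta$
is denoted as $\mathit{dom}(\theta)$.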



3 The Structure of Terms and Types 

In this section, we show how the type of a term gives rise to a certain way of
characterising its structure, and in particular, how its degree of instantiation can be
characterised in a structured way. We alternate between recalling the formalism of [LS01]
and adapting it to polymorphic types, thereby linking to [SHK00].

3.1 Regular Types [LS01]



Definition 3.1 A top-down deterministic finite tree automaton (top-down
DFTA) is a tuple $A = \langle q_0, Q, \Sigma, \Delta \rangle$, where $Q$ is a set of states, $q_0 \in Q$ is an initial state,
$\Sigma$ is a set of function symbols with arities, and $\Delta$ is a set of transition rules of the form
$q(f(x_1, \ldots, x_n)) \to f(q_1(x_1), \ldots, q_n(x_n))$, such that no two rules have the same left-hand side.

Top-down DFTA's accept the class of languages called regular types.

Definition 3.2 A regular tree grammar is a tuple $\mathcal{G} = \langle S, W, \Sigma, \Delta \rangle$, where $W$ is
a finite set of non-terminal symbols, $S \in W$ is a starting non-terminal, and $\Delta$ is a set
of productions of the form $X \to f(Y_1, \ldots, Y_n)$ s.t. $X, Y_1, \ldots, Y_n \in W$ and $f/n \in \Sigma$.
A regular tree grammar is deterministic if for any non-terminal $X$ and any two
productions $X \to f(Y_1, \ldots, Y_n)$ and $X \to g(Z_1, \ldots, Z_m)$, we have $f/n \neq g/m$.

It has been pointed out that the two formalisms above define the same class of 
languages. Transitions of the automaton can be converted to grammar productions 
and vice versa by identifying each non-terminal symbol with a corresponding state of 
the automaton. 

Example 3.3 The DFTA $\langle q_L, \{q_L, q_E\}, \{\mathtt{nil}/0, \mathtt{cons}/2, \mathtt{a}/0, \mathtt{b}/0\}, \Delta \rangle$, where

$$
\Delta = \{ q_L(\mathtt{nil}) \to \mathtt{nil},\ q_L(\mathtt{cons}(x, y)) \to \mathtt{cons}(q_E(x), q_L(y)),\
q_E(\mathtt{a}) \to \mathtt{a},\ q_E(\mathtt{b}) \to \mathtt{b} \},
$$

accepts ground lists of a's and b's.

The grammar $L \to \mathtt{nil} \mid \mathtt{cons}(E, L)$, $E \to \mathtt{a} \mid \mathtt{b}$ defines the same language.

Automata          Grammars       Types               Graphs
state             non-terminal   type                node
transition rule   production     type declarations   edge

Table 2: Correspondences between formalisms



The equivalence of the two formalisms allows us to represent derivations of the
grammar as transitions of a deterministic rewriting system. These transitions have
the form $N(f(t_1, \ldots, t_n)) \to f(N_1(t_1), \ldots, N_n(t_n))$, where $N \to f(N_1, \ldots, N_n)$ is a
production of the grammar ($n \geq 0$). Using this notation, we say that the grammar
$\mathcal{G} = \langle S, W, \Sigma, \Delta \rangle$ accepts a term $t$ if $S(t) \to^* t$. Sometimes we are interested in
a particular segment of a single path in a derivation tree starting from root $S$ and
reaching a non-terminal $N$ with a subterm $t'$ of $t$, i.e., in derivations $S(t) \to^* s[N(t')]$,
where the notation $s[N(t')]$ means that $s$ has $N(t')$ as a subterm. Abusing notation,
we write $S(t) \to^* N(t')$ in this case.

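Example 3.4 Given the grammar in Ex. 3.3, we have

$$
L(\mathtt{cons(a, nil)}) \to \mathtt{cons}(E(\mathtt{a}), L(\mathtt{nil})) \to \mathtt{cons}(\mathtt{a}, L(\mathtt{nil})) \to \mathtt{cons(a, nil)}.
$$

We also write $L(\mathtt{cons(a, nil)}) \to^* E(\mathtt{a})$ and $L(\mathtt{cons(a, nil)}) \to^* L(\mathtt{nil})$, using the
above notation.

This notation can also be applied to non-ground terms. For example, we have
$L(\mathtt{cons(a, Y)}) \to^* E(\mathtt{a})$ and $L(\mathtt{cons(X, Y)}) \to^* L(\mathtt{Y})$.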
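The acceptance test $S(t) \to^* t$ can be sketched in a few lines of Python for the grammar of Ex. 3.3. This is only an illustration under assumptions of our own: terms are encoded as nested tuples, and the names `PRODUCTIONS` and `accepts` are hypothetical, not taken from [LS01].

```python
# Terms are nested tuples: ("cons", ("a",), ("nil",)) stands for cons(a, nil).
# Productions of the deterministic grammar of Ex. 3.3:
#   L -> nil | cons(E, L),   E -> a | b
# Keying on (non-terminal, functor, arity) makes determinism explicit:
# at most one production can apply to a given N(f(...)).
PRODUCTIONS = {
    ("L", "nil", 0): (),
    ("L", "cons", 2): ("E", "L"),
    ("E", "a", 0): (),
    ("E", "b", 0): (),
}


def accepts(nonterminal, term):
    """Return True if nonterminal(term) rewrites to term, i.e. N(t) ->* t."""
    functor, args = term[0], term[1:]
    children = PRODUCTIONS.get((nonterminal, functor, len(args)))
    if children is None:
        # No production N -> f(N1, ..., Nn) exists: the derivation is stuck.
        return False
    # Apply N(f(t1..tn)) -> f(N1(t1), ..., Nn(tn)) and continue below.
    return all(accepts(n, t) for n, t in zip(children, args))


if __name__ == "__main__":
    print(accepts("L", ("cons", ("a",), ("nil",))))    # cons(a, nil)
    print(accepts("L", ("cons", ("nil",), ("nil",))))  # cons(nil, nil)
```

Because the grammar is deterministic, no backtracking is needed: each rewriting step is fixed by the non-terminal and the outermost function symbol.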

It is also convenient to depict a grammar graphically as a type graph, defined
previously as a graph whose nodes are labelled with types or functions [VCL93]. We
simplify that definition by leaving out the function nodes. Thus a type graph for
$\mathcal{G} = \langle S, W, \Sigma, \Delta \rangle$ is a directed graph whose nodes are labelled by non-terminals, and
there is an edge from $N$ to $N'$ if and only if there is a production
$N \to f(\ldots, N', \ldots)$ in $\mathcal{G}$. We call the node labelled $S$ the starting node. Figure 1
shows the type graph for Ex. 3.3.

[Figure 1: Type graph (nodes L and E)]



3.2 Regular Types and Polymorphism 

Converting the type declarations of a typed language such as Mercury into grammar
rules has been considered straightforward [LS01, footnote 1]. This seems justified,
albeit only in the absence of polymorphism. Since $\mathcal{K}$ is a finite set of type constants, we
can identify each type constant with a non-terminal, and each function
$f_{\tau_1 \ldots \tau_n \to \tau} \in \mathcal{F}$ is translated into a production $\tau \to f(\tau_1, \ldots, \tau_n)$. In that way, each
type (constant) corresponds to a grammar with that type as starting non-terminal.^3
Table 2 summarises the correspondences between the four formalisms we effectively
identify in this paper.

Note that in Sec. 2, we have specified that each $f$ has exactly one declaration; in
other words, there is no overloading. This is a sufficient condition for the grammar to
be deterministic. One could allow some overloading, by specifying: if
$f_{\tau_1 \ldots \tau_n \to \tau} \in \mathcal{F}$ and $f_{\sigma_1 \ldots \sigma_m \to \sigma} \in \mathcal{F}$ and
$f_{\tau_1 \ldots \tau_n \to \tau} \neq f_{\sigma_1 \ldots \sigma_m \to \sigma}$, then either $\tau \neq \sigma$, or $n \neq m$. This
would strictly include the overloading allowed in Gödel. We prefer however to disallow
overloading to avoid unnecessary complication.



^3 One might be confused by the fact that [LS01] also says that there is a grammar for
each program variable, but this is simply a matter of renaming.

[Figure 2: Some type graphs, with starting node highlighted. Panels: (a) c(U), c(c(U)),
c(c(c(U))), ...; (b) k(U); (c) k(str), with nodes str and int; LISTS, with nodes list(U)
and U; NESTS, with nodes nest(V), list(nest(V)) and V; TABLES, with nodes table(U), U,
str and bal]



We now give a pseudo-definition of a grammar corresponding to a polymorphic 
type — "pseudo" because the set of non-terminals may be infinite. It is trivial to 
generalise the definition of a grammar to infinite sets of non-terminals, but in the end, 
it is not desirable, since it would undermine our goal of characterising (approximating) 
properties of a term of arbitrary type in a finite way. We will later impose a condition 
to ensure finiteness. 



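Definition 3.5 Consider a typed language given by $\mathcal{K}$, $\mathcal{F}$ and a type $\phi$. The
grammar corresponding to $\phi$, denoted $\mathcal{G}(\phi)$, is the grammar
$\langle \text{'}\phi\text{'}, W, \mathcal{F}, \Delta \rangle$, where $W$ is inductively defined as follows

• $\text{'}\phi\text{'} \in W$,

• $f_{\tau_1 \ldots \tau_n \to \tau} \in \mathcal{F}$ and $\text{'}\tau\Theta\text{'} \in W$ for some type substitution $\Theta$ implies
$\text{'}\tau_1\Theta\text{'}, \ldots, \text{'}\tau_n\Theta\text{'} \in W$,

and $\Delta = \{ \text{'}\tau\Theta\text{'} \to f_{\tau_1 \ldots \tau_n \to \tau}(\text{'}\tau_1\Theta\text{'}, \ldots, \text{'}\tau_n\Theta\text{'}) \mid \text{'}\tau\Theta\text{'} \in W \}$.

We put types in quotation marks to indicate that when looking at the grammar,
types are just non-terminal symbols. Type graphs are defined as before. In Fig. 2, we
give some type graphs to which we will refer frequently.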

It is also useful to have names and a notation for the relations holding between the 
types in a type graph. 

Definition 3.6 A type $\sigma$ is a direct subterm type of $\phi$ (denoted as $\sigma \triangleleft \phi$) if there
is $f_{\tau_1 \ldots \tau_n \to \tau} \in \mathcal{F}$ and a type substitution $\Theta$ such that $\tau\Theta = \phi$ and $\tau_i\Theta = \sigma$ for some
$i \in \{1, \ldots, n\}$. The transitive, reflexive closure of $\triangleleft$ is denoted as $\triangleleft^*$. If $\sigma \triangleleft^* \phi$, then $\sigma$
is a subterm type of $\phi$.

We now discuss two problems related to the generalisation to polymorphism,
including that of finiteness mentioned above.

Example 3.7 Whenever we give a particular typed language, $\mathcal{K}$ is given implicitly as
the set of all type constructors occurring in the type subscripts in $\mathcal{F}$.

One would hope that even if a typed language contains an infinite set of types, the
type graph taking a fixed type as starting node should be finite. However, consider
$\mathcal{F} = \{ f_{\mathtt{c(c(U))} \to \mathtt{c(U)}} \}$. The type graph of c(U) is shown in Fig. 2 (a). As can be seen, it
is infinite.

We impose the following restriction on any typed language to ensure finiteness.

Reflexive Condition: For all $c \in \mathcal{K}$ and types $\sigma = c(\bar{\sigma})$, $\tau = c(\bar{\tau})$, if $\sigma \triangleleft^* \tau$, then $\sigma$
is a sub"term" (in the syntactic sense) of $\tau$.

Clearly, this condition is violated by the example above, where c(c(U)) $\triangleleft$ c(U). With
this condition in place, it is easy to see that any type graph for a given starting node
is finite.

In the introduction we mentioned another problem, namely that the construction
of abstract domains should be "truly parametric".

Example 3.8 Consider $\mathcal{F} = \{ f_{\mathtt{U} \to \mathtt{k(U)}},\ g_{\mathtt{int} \to \mathtt{k(str)}} \}$. Figure 2 (b) shows the type graph
for k(U). Essentially, the type graph for any instance k($\tau$) is obtained by replacing the
node k(U) with k($\tau$) and the node U with the type graph for $\tau$. However, there is one
exception to this: if $\tau$ = str, then the type graph is the one shown in Fig. 2 (c). For
this example, it would clearly be wrong to say that "k(U) relates to U in the same way
as k(str) relates to str".

Again, we rule out this anomaly. First we define: 

Definition 3.9 A flat type is a type of the form $c(\bar{u})$, where $c \in \mathcal{K}$.

We now impose the following condition on any typed language.

Flat Range Condition: For all $f_{\tau_1 \ldots \tau_n \to \tau} \in \mathcal{F}$, $\tau$ is a flat type.

In Mercury (and also in functional languages such as ML or Haskell), this condition
is enforced by the syntax. In Gödel, it is possible to violate the condition, but this can
be regarded as an artefact of that syntax.

Thus we assume from now on that any typed language we consider meets the two
conditions above.
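To make the construction of a type graph concrete, the following Python sketch computes the nodes and edges reachable from a starting type, assuming the Flat Range Condition so that matching a declared range $c(\bar{u})$ against a type $c(\ldots)$ is a simple positional binding of parameters. The encoding (parameters as strings, constructed types as tuples) and the names `direct_subterm_types`, `type_graph` and `max_nodes` are our own illustrative assumptions; the node bound only serves to stop on declarations that violate the Reflexive Condition, as in Ex. 3.7.

```python
from collections import deque

# Parameters are strings ("U"); constructed types are tuples:
# ("list", "U") is list(U), ("int",) is the constant type int.
# A declaration maps a function name to (argument types, range type).


def is_param(t):
    return isinstance(t, str)


def substitute(t, theta):
    """Apply the type substitution theta to the type t."""
    if is_param(t):
        return theta.get(t, t)
    return (t[0],) + tuple(substitute(a, theta) for a in t[1:])


def direct_subterm_types(phi, decls):
    """All sigma that are direct subterm types of phi (flat ranges assumed)."""
    if is_param(phi):
        return set()  # a flat range can never be instantiated to a parameter
    result = set()
    for arg_types, rng in decls.values():
        if rng[0] == phi[0] and len(rng) == len(phi):
            theta = dict(zip(rng[1:], phi[1:]))  # rng = c(u1..uk), phi = c(s1..sk)
            result |= {substitute(a, theta) for a in arg_types}
    return result


def type_graph(start, decls, max_nodes=50):
    """Map each reachable type to the set of its direct subterm types."""
    edges, queue = {}, deque([start])
    while queue:
        phi = queue.popleft()
        if phi in edges:
            continue
        if len(edges) >= max_nodes:
            raise RuntimeError("type graph exceeds max_nodes; it may be infinite")
        edges[phi] = direct_subterm_types(phi, decls)
        queue.extend(edges[phi])
    return edges


LISTS = {
    "nil": ((), ("list", "U")),
    "cons": (("U", ("list", "U")), ("list", "U")),
}

if __name__ == "__main__":
    for node, succs in type_graph(("list", ("int",)), LISTS).items():
        print(node, "->", succs)
```

Running `type_graph` on the declaration of Ex. 3.7 keeps generating the types c(U), c(c(U)), c(c(c(U))), ... until the bound is hit, which is the infinite graph of Fig. 2 (a).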



3.3 Labelling [LS01]

Labellings can be used to characterise the degree of instantiation of a term taking its
type into account, i.e., analyse a term on a per-role basis [LS01].

Definition 3.10 A variable $x$ in a term $t$ labels a non-terminal $N$ of a grammar $\mathcal{G}$ if
$S(t) \to^* N(x)$, where $S$ is the starting non-terminal of $\mathcal{G}$.

We denote by $\zeta(S, N, t)$ the function which returns the set of variables $x$ such that
$S(t) \to^* N(x)$ (one could also write $\zeta(\mathcal{G}, N, t)$ [LS01]).



Example 3.11 The grammar $LL \to \mathtt{nil} \mid \mathtt{cons}(L, LL)$, $L \to \mathtt{nil} \mid \mathtt{cons}(E, L)$,
$E \to \mathtt{a} \mid \mathtt{b}$ accepts ground lists of lists of a's and b's. Note that the use of cons and nil
could be regarded as overloading, but this is not forbidden by [LS01] as it is not in
contradiction to the grammar being deterministic.

We use the usual list notation for ease of reading. The type graph of $LL$ is shown in
Fig. 3. We are interested in the labelling of all non-terminals reachable from $LL$.
Let $t$ = [[a], [b]]. Then $\zeta(LL, E, t) = \zeta(LL, L, t) = \zeta(LL, LL, t) = \emptyset$. Now
let $t$ = [[a], [X]]. Then $\zeta(LL, E, t) = \{X\}$ and $\zeta(LL, L, t) = \zeta(LL, LL, t) = \emptyset$.
Now let $t$ = [[a], X]. Then $\zeta(LL, E, t) = \emptyset$, $\zeta(LL, L, t) = \{X\}$ and
$\zeta(LL, LL, t) = \emptyset$.

[Figure 3: List of lists (type graph with nodes LL, L and E)]
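The labelling function $\zeta$ of Def. 3.10 can be computed by following the deterministic derivation and recording the non-terminal at which each variable is reached. The sketch below does this for the grammar of Ex. 3.11. The term encoding (variables as plain strings, other terms as tuples) and the helper names `zeta` and `lst` are assumptions made for this illustration only.

```python
# Grammar of Ex. 3.11: LL -> nil | cons(L, LL), L -> nil | cons(E, L), E -> a | b
PRODUCTIONS = {
    ("LL", "nil", 0): (),
    ("LL", "cons", 2): ("L", "LL"),
    ("L", "nil", 0): (),
    ("L", "cons", 2): ("E", "L"),
    ("E", "a", 0): (),
    ("E", "b", 0): (),
}


def zeta(start, target, term):
    """Variables x of term such that start(term) ->* target(x)."""
    if isinstance(term, str):  # a variable: it labels the current non-terminal
        return {term} if start == target else set()
    functor, args = term[0], term[1:]
    children = PRODUCTIONS.get((start, functor, len(args)))
    if children is None:
        raise ValueError(f"{start} has no production for {functor}/{len(args)}")
    labels = set()
    for nonterminal, sub in zip(children, args):
        labels |= zeta(nonterminal, target, sub)
    return labels


def lst(*items, tail=("nil",)):
    """Build cons(i1, cons(i2, ... tail)) from Python arguments."""
    for item in reversed(items):
        tail = ("cons", item, tail)
    return tail


if __name__ == "__main__":
    a = ("a",)
    t = lst(lst(a), lst("X"))  # [[a], [X]]
    print({n: zeta("LL", n, t) for n in ("LL", "L", "E")})
    t = lst(lst(a), "X")       # [[a], X]
    print({n: zeta("LL", n, t) for n in ("LL", "L", "E")})
```

The two printed dictionaries correspond to the second and third term of the example: X labels $E$ in [[a], [X]] and $L$ in [[a], X].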



3.4 Labelling and Polymorphism 

We now want to adapt the idea of labelling to the case when we have polymorphism.
With polymorphism, one can have infinitely many types, and even though
the type graph for a fixed type as starting node is finite, it can become arbitrarily
large, i.e., it can have an arbitrary number of non-terminals reachable from the
starting node. Also, it would clearly be desirable to describe the labellings for say
list(int), list(list(int)), ... in a uniform way. This motivates defining a hierarchy
in the type graph.

Definition 3.12 A type $\sigma$ is a recursive type of $\phi$ (denoted as $\sigma \bowtie \phi$) if $\sigma \triangleleft^* \phi$ and
$\phi \triangleleft^* \sigma$. We write $\bowtie(\phi)$ for the tuple of recursive types of $\phi$ other than itself, ordered
by $\prec$ (see Sec. 2).

A type $\sigma$ is a non-recursive subterm type (NRS) of $\phi$ (denoted as $\sigma \trianglelefteq \phi$) if
$\phi \not\triangleleft^* \sigma$ and there is a type $\tau$ such that $\sigma \triangleleft \tau$ and $\tau \bowtie \phi$. We write $\trianglelefteq(\phi)$ for the tuple
of non-recursive subterm types of $\phi$, ordered by $\prec$.

It follows immediately from the definition that, for any types $\phi$, $\sigma$, we have $\phi \bowtie \phi$
and, if $\sigma \trianglelefteq \phi$, then $\sigma \not\bowtie \phi$. Consider the type graph for $\phi$. The recursive types of
$\phi$ are all the types in the strongly connected component (SCC) containing $\phi$. The
non-recursive subterm types of $\phi$ are all the types $\sigma$ not in the SCC but such that
there is an edge from the SCC containing $\phi$ to $\sigma$.
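The graph reading of Def. 3.12 translates directly into code: the recursive types are the nodes mutually reachable with the starting node, and the NRSs are the remaining direct successors of those nodes. The Python sketch below assumes that a type graph is given as a dictionary from node names to successor sets; the three graphs are written out by hand from Fig. 2 and Fig. 3, and the function names are hypothetical.

```python
def reachable(graph, start):
    """Nodes reachable from start (reflexive, transitive closure of the edges)."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in graph[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def recursive_types(graph, phi):
    """The SCC of phi: types sigma that reach phi and are reachable from phi."""
    return {s for s in reachable(graph, phi) if phi in reachable(graph, s)}


def nrs(graph, phi):
    """Direct successors of the SCC of phi that lie outside that SCC."""
    scc = recursive_types(graph, phi)
    return {s for t in scc for s in graph[t]} - scc


# Edges N -> N' mean that N' is a direct subterm type of N.
TABLES = {"table(U)": {"table(U)", "str", "U", "bal"},
          "str": set(), "U": set(), "bal": set()}
NESTS = {"nest(V)": {"V", "list(nest(V))"},
         "list(nest(V))": {"nest(V)", "list(nest(V))"},
         "V": set()}
LISTS_OF_LISTS = {"LL": {"L", "LL"}, "L": {"E", "L"}, "E": set()}

if __name__ == "__main__":
    # Python's default string order puts the upper-case parameter names first,
    # which here coincides with the order required by the definition.
    print(tuple(sorted(nrs(TABLES, "table(U)"))))
    print(sorted(recursive_types(NESTS, "nest(V)")), sorted(nrs(NESTS, "nest(V)")))
```

For the monomorphic grammar of Fig. 3, the same functions give the single recursive type LL and the single NRS L, so E is reachable from LL without being either.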

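Example 3.13 Consider Fig. 2. Let
$\mathcal{F}_{\mathrm{Lists}} = \{ \mathtt{nil}_{\to \mathtt{list(U)}},\ \mathtt{cons}_{\mathtt{U}, \mathtt{list(U)} \to \mathtt{list(U)}} \}$. We
have list(U) $\bowtie$ list(U) and U $\trianglelefteq$ list(U).

Let $\mathcal{F}_{\mathrm{Nests}} = \mathcal{F}_{\mathrm{Lists}} \cup \{ \mathtt{e}_{\mathtt{V} \to \mathtt{nest(V)}},\ \mathtt{n}_{\mathtt{list(nest(V))} \to \mathtt{nest(V)}} \}$. The NESTS language
implements rose trees [Mee88], i.e., trees where the number of children of each node is not
fixed. We have list(nest(V)) $\bowtie$ nest(V) and nest(V) $\bowtie$ nest(V) and V $\trianglelefteq$ nest(V).

Suppose $\mathcal{F}_{\mathrm{Strings}}$ contains all strings with str as type subscript. Let

$$
\mathcal{F}_{\mathrm{Tables}} = \mathcal{F}_{\mathrm{Strings}} \cup \{ \mathtt{lh}_{\to \mathtt{bal}},\ \mathtt{rh}_{\to \mathtt{bal}},\ \mathtt{eq}_{\to \mathtt{bal}},\
\mathtt{null}_{\to \mathtt{table(U)}},\ \mathtt{node}_{\mathtt{table(U)}, \mathtt{str}, \mathtt{U}, \mathtt{bal}, \mathtt{table(U)} \to \mathtt{table(U)}} \}.
$$

The type bal contains three constants representing balancing information. We have
table(U) $\bowtie$ table(U) and $\trianglelefteq$(table(U)) = (U, bal, str).

An NRS of a flat type is often just a parameter of that type, as in U $\trianglelefteq$ list(U).
However, this is not always the case, as witnessed by str $\trianglelefteq$ table(U).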

Instead of looking at the labellings of all non-terminals reachable from some
starting node without distinction [LS01], we classify them according to the recursive types
and the NRSs of that node. This will be reflected in the construction of abstract
domains.

Definition 3.12 is obviously applicable in particular in the monomorphic case and
thus to the grammars as in [LS01]. Figure 3 shows that the recursive types and
NRSs may not be all types reachable from a starting node. In that example, we have
$LL \bowtie LL$ and $L \trianglelefteq LL$. In the approach of [LS01], we may also be interested in
$\zeta(LL, E, t)$ for some term $t$, so in the labellings of $E$. In the approach proposed here,
the domain construction for $LL$ depends on $E$ only indirectly, via the abstract domain
for $L$. Without such an inductive approach to domain construction, we would not
know how to deal with polymorphism.

The key to devising a "parametric" abstract domain construction is to focus on type
constructors, or equivalently, on flat types $c(\bar{u})$. So for example, we should focus on
list(U) and not a particular instance such as list(int). This may not be surprising,
but it has two consequences which may not be obvious.

First, note that the relation $\trianglelefteq$ is not stable under instantiation of types. This
can be seen by comparing LISTS with NESTS. We have U $\trianglelefteq$ list(U), but nest(V) $\bowtie$
list(nest(V)). The abstract domain for list(nest(V)) however, being derived from
the abstract domain for list(U), must relate to nest(V) as if nest(V) was an NRS
of list(nest(V)). In contrast, the abstract domain for nest(V) must reflect that
list(nest(V)) $\bowtie$ nest(V). One could paraphrase this by saying: LISTS does not know
about NESTS, but NESTS does know about LISTS.

The second consequence can be illustrated using TABLES. The type table(U) has
three NRSs, and each of them will be dealt with in the construction of the abstract
domain. However, the instance table(string) has only two NRSs, as U becomes
instantiated to string. The domain for table($\tau$) will be based on assuming three
NRSs, i.e., it will deal with the value and the key arguments separately, even if by
coincidence $\tau$ is equal to string.



We now define a function $\mathcal{Z}$ in analogy to $\zeta$. In the approach of [LS01], we could
safely identify a grammar with its starting non-terminal. In what follows, we will
always assume a grammar $\mathcal{G}(\phi)$ where $\phi$ is flat (Def. 3.5). However, it is also useful
to consider productions of that grammar starting from some other non-terminal than
the "official" starting non-terminal. Therefore $\mathcal{Z}$ has four arguments, the additional
first one specifying the grammar and the second the starting symbol.

Unlike $\zeta$, the function $\mathcal{Z}$ also collects non-variable terms.

Definition 3.14 Let $\phi$ be a flat type, $\tau$ be a type such that $\tau \bowtie \phi$, and $\sigma$ a type such
that either $\sigma \bowtie \phi$ or $\sigma \trianglelefteq \phi$.

We denote by $\mathcal{Z}(\phi, \tau, \sigma, t)$ the function which returns the set of all terms $s$ such
that $\text{'}\tau\text{'}(t) \to^* \text{'}\sigma\text{'}(s)$ in the grammar $\mathcal{G}(\phi)$.

The function $\mathcal{Z}$ is lifted to sets (in the fourth argument) in the obvious way.

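Example 3.15 Let $\mathcal{F} = \mathcal{F}_{\mathrm{Lists}} \cup \{ \mathtt{a}_{\to \mathtt{char}}, \mathtt{b}_{\to \mathtt{char}} \}$. We have

$$
\begin{array}{lcl}
\mathcal{Z}(\mathtt{list(U)}, \mathtt{list(U)}, \mathtt{list(U)}, \mathtt{[a,X]}) & = & \{ \mathtt{[a,X]}, \mathtt{[X]}, \mathtt{[]} \} \\
\mathcal{Z}(\mathtt{list(U)}, \mathtt{list(U)}, \mathtt{U}, \mathtt{[a,X]}) & = & \{ \mathtt{a}, \mathtt{X} \} \\[1ex]
\mathcal{Z}(\mathtt{list(U)}, \mathtt{list(U)}, \mathtt{list(U)}, \mathtt{[[a],[X]]}) & = & \{ \mathtt{[[a],[X]]}, \mathtt{[[X]]}, \mathtt{[]} \} \\
\mathcal{Z}(\mathtt{list(U)}, \mathtt{list(U)}, \mathtt{U}, \mathtt{[[a],[X]]}) & = & \{ \mathtt{[a]}, \mathtt{[X]} \} \\[1ex]
\mathcal{Z}(\mathtt{list(U)}, \mathtt{list(U)}, \mathtt{list(U)}, \mathtt{[[a]|X]}) & = & \{ \mathtt{[[a]|X]}, \mathtt{X} \} \\
\mathcal{Z}(\mathtt{list(U)}, \mathtt{list(U)}, \mathtt{U}, \mathtt{[[a]|X]}) & = & \{ \mathtt{[a]} \}.
\end{array}
$$

Note that unlike $\zeta$ (Ex. 3.11), $\mathcal{Z}$ cannot be used to extract from the term [[a], [X]] the
subterm X directly.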

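Now consider the NESTS example, assuming that $\mathcal{F}_{\mathrm{Nests}}$ is augmented with the
integers $\mathtt{1}_{\to \mathtt{int}}, \ldots$. We have

$$
\begin{array}{lclr}
\mathcal{Z}(\mathtt{nest(V)}, \mathtt{nest(V)}, \mathtt{nest(V)}, \mathtt{n([e(7)])}) & = & \{ \mathtt{n([e(7)])}, \mathtt{e(7)} \} & \\
\mathcal{Z}(\mathtt{nest(V)}, \mathtt{nest(V)}, \mathtt{list(nest(V))}, \mathtt{n([e(7)])}) & = & \{ \mathtt{[e(7)]}, \mathtt{[]} \} & \\
\mathcal{Z}(\mathtt{nest(V)}, \mathtt{nest(V)}, \mathtt{V}, \mathtt{n([e(7)])}) & = & \{ \mathtt{7} \} & \\[1ex]
\mathcal{Z}(\mathtt{nest(V)}, \mathtt{list(nest(V))}, \mathtt{nest(V)}, \mathtt{[n([e(7)])]}) & = & \{ \mathtt{n([e(7)])}, \mathtt{e(7)} \} & (1) \\
\mathcal{Z}(\mathtt{nest(V)}, \mathtt{list(nest(V))}, \mathtt{list(nest(V))}, \mathtt{[n([e(7)])]}) & = & \{ \mathtt{[n([e(7)])]}, \mathtt{[e(7)]}, \mathtt{[]} \} & (2) \\
\mathcal{Z}(\mathtt{nest(V)}, \mathtt{list(nest(V))}, \mathtt{V}, \mathtt{[n([e(7)])]}) & = & \{ \mathtt{7} \} & (3) \\[1ex]
\mathcal{Z}(\mathtt{list(U)}, \mathtt{list(U)}, \mathtt{list(U)}, \mathtt{[n([e(7)])]}) & = & \{ \mathtt{[n([e(7)])]}, \mathtt{[]} \} & (4) \\
\mathcal{Z}(\mathtt{list(U)}, \mathtt{list(U)}, \mathtt{U}, \mathtt{[n([e(7)])]}) & = & \{ \mathtt{n([e(7)])} \} & (5) \\[1ex]
\mathcal{Z}(\mathtt{nest(V)}, \mathtt{nest(V)}, \mathtt{V}, \mathtt{e(7)}) & = & \{ \mathtt{7} \} & (6)
\end{array}
$$

Note the difference between the labellings obtained for [e(7)] depending on whether
we use the grammar for nest(V) (1, 2, 3), or the grammar for list(U) (4, 5).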
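Lemma 3.16 Let $\phi$ and $\tau$ be flat types such that $\tau\Theta \bowtie \phi$ for some $\Theta$. Let
$t = f_{\tau_1 \ldots \tau_n \to \tau}(t_1, \ldots, t_n)$ be a term such that $\_ \vdash t : \tau\Theta\Theta'$ for some $\Theta'$. Then

$$
\mathcal{Z}(\phi, \tau\Theta, \sigma, t) =
\begin{cases}
\bigcup_{\tau_i\Theta \bowtie \phi} \mathcal{Z}(\phi, \tau_i\Theta, \sigma, t_i) & \text{if } \sigma \bowtie \phi,\ \sigma \neq \tau\Theta, \\[1ex]
\{t\} \cup \bigcup_{\tau_i\Theta \bowtie \phi} \mathcal{Z}(\phi, \tau_i\Theta, \sigma, t_i) & \text{if } \sigma = \tau\Theta.
\end{cases}
$$

Proof. Follows from the fact that for each $i \in \{1, \ldots, n\}$, we have
$\text{'}\tau\Theta\text{'}(t) \to \text{'}\tau_i\Theta\text{'}(t_i)$ in $\mathcal{G}(\phi)$. □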



4 Abstract Terms 

In this section, we define an abstraction of terms using the notions of recursive type
and non-recursive subterm type. This amounts to a generalisation of [CL00].



4.1 Abstract Domains and Terms 

We first introduce the formalism of set logic programs shown to be powerful for program
analyses [LS01]. Consider a language based on a set of variables $\mathcal{V}$ and a set of functions
$\mathcal{F}_\oplus = \{\emptyset, \oplus\}$, where $\emptyset/0$ represents the empty set and $\oplus/2$ is a set constructor. Set
expressions are elements of the term algebra $\mathcal{T}(\mathcal{F}_\oplus, \mathcal{V})$ modulo the ACI1 equality
theory, consisting of:

$$
\begin{array}{llll}
(x \oplus y) \oplus z = x \oplus (y \oplus z) & \text{(associativity)} & x \oplus x = x & \text{(idempotence)} \\
x \oplus y = y \oplus x & \text{(commutativity)} & x \oplus \emptyset = x & \text{(unity)}
\end{array}
$$
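Over the signature $\{\emptyset, \oplus\}$ alone, two set expressions are equal modulo ACI1 exactly when they mention the same variables, so a finite set of variables serves as a normal form. The Python sketch below illustrates this equality check only; it is not a unification algorithm, and the encoding (`EMPTY`, `plus`, variables as strings) is an assumption made for the illustration.

```python
# Set expressions: a variable is a string, EMPTY is the empty set,
# and plus(l, r) builds the expression l (+) r.
EMPTY = ("empty",)


def plus(left, right):
    return ("plus", left, right)


def normal_form(expr):
    """Flatten a set expression into the frozenset of its variables.

    Associativity and commutativity are absorbed by using a set,
    idempotence by set union, and unity by mapping EMPTY to frozenset().
    """
    if isinstance(expr, str):
        return frozenset([expr])
    if expr == EMPTY:
        return frozenset()
    _, left, right = expr
    return normal_form(left) | normal_form(right)


def equal_aci1(e1, e2):
    """Decide equality of two set expressions modulo ACI1."""
    return normal_form(e1) == normal_form(e2)


if __name__ == "__main__":
    print(equal_aci1(plus(plus("x", "y"), "z"), plus("x", plus("y", "z"))))
    print(equal_aci1(plus("x", EMPTY), "x"))
    print(equal_aci1("x", "y"))
```

The same idea of representing $\oplus$-sums as sets carries over to richer signatures, where the summands are no longer only variables.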



Lagoon and Stuckey [LS01] now proceed by regarding each non-terminal in the
grammar corresponding to a variable $x$ as an abstract variable. The instantiation of that
abstract variable obtained from the execution of the abstract program gives us
information about the labels of the non-terminal after the concrete execution.

We do not see how this approach could work in the presence of polymorphism.
Instead, we follow [CL00]. We now introduce new function symbols $c^{\mathcal{A}}$, one for each
type constructor $c \in \mathcal{K}$, in addition to $\emptyset$ and $\oplus$. These are used to collect the
information corresponding to the different non-terminals in a structure, which we call abstract
term. The arity of $c^{\mathcal{A}}$ is given by the arity of $\trianglelefteq(c(\bar{u}))$ plus the arity of $\bowtie(c(\bar{u}))$.

Definition 4.1 We define

$$
\mathcal{F}^{\mathcal{A}} := \mathcal{F}_\oplus \cup \{ c^{\mathcal{A}}/m \mid c \in \mathcal{K},\ m = \#(\trianglelefteq(c(\bar{u}))) + \#(\bowtie(c(\bar{u}))) \}.
$$

Now let $\tau = c(\bar{u})$, $\trianglelefteq(\tau) = (\rho_1, \ldots, \rho_{m'})$, and $\bowtie(\tau) = (\rho_{m'+1}, \ldots, \rho_m)$. For a term
$t = f_{\tau_1 \ldots \tau_n \to \tau}(t_1, \ldots, t_n)$, we define

$$
\alpha(t) = c^{\mathcal{A}}\Bigl( \bigoplus_{\tau_i = \rho_1} \alpha(t_i), \ldots, \bigoplus_{\tau_i = \rho_m} \alpha(t_i) \Bigr) \oplus \bigoplus_{\tau_i = \tau} \alpha(t_i).
$$

For a variable $x$ we define $\alpha(x) = x$. We call the image of $\alpha$ the domain of abstract
terms, or simply the abstract domain.

In [CL96], the abstraction function is denoted type, and it is essentially a special
case of the above definition for $\#(\trianglelefteq(c(\bar{u}))) = 1$ and $\#(\bowtie(c(\bar{u}))) = 0$. Using our
terminology, they assume that there is a function, which they denote by a, returning for
each $f_{\ldots \to c(\bar{u})}$ the type constructor $c$; moreover, there is a function $\pi$ returning the set of
argument positions of $f$ that have declared type $c(\bar{u})$, and all other argument positions
are assumed to have declared type $u$. However, since their typing is descriptive, a and
$\pi$ have to be provided by the user.



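Example 4.2 Consider again Ex. 3.15.

$$
\begin{array}{lcl}
\alpha(\mathtt{7}) & = & \mathtt{int}^{\mathcal{A}} \\
\alpha(\mathtt{[7]}) & = & \mathtt{list}^{\mathcal{A}}(\alpha(\mathtt{7})) \oplus \alpha(\mathtt{nil}) = \mathtt{list}^{\mathcal{A}}(\mathtt{int}^{\mathcal{A}}) \oplus \mathtt{list}^{\mathcal{A}}(\emptyset) \\
\alpha(\mathtt{e(7)}) & = & \mathtt{nest}^{\mathcal{A}}(\alpha(\mathtt{7}), \emptyset) = \mathtt{nest}^{\mathcal{A}}(\mathtt{int}^{\mathcal{A}}, \emptyset) \\
\alpha(\mathtt{n([e(7)])}) & = & \mathtt{nest}^{\mathcal{A}}(\emptyset, \alpha(\mathtt{[e(7)]})) = \mathtt{nest}^{\mathcal{A}}(\emptyset, \mathtt{list}^{\mathcal{A}}(\mathtt{nest}^{\mathcal{A}}(\mathtt{int}^{\mathcal{A}}, \emptyset)) \oplus \mathtt{list}^{\mathcal{A}}(\emptyset))
\end{array}
$$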

Note how it comes into play that the empty $\oplus$-sequence is naturally defined as
the neutral element $\emptyset$. There is a notable difference between [CL96] and [CL00] at
this point. In the latter, there is no neutral element. This means in particular that
nil cannot be abstracted as list($\emptyset$). Instead, it is abstracted as nil, and as a
consequence, the list [7] is abstracted as list(int) $\oplus$ nil. While it is argued that such
an abstraction simplifies the implementation of abstract unification, we believe that
an object list(int) $\oplus$ nil mixes types/abstract terms on the one hand and concrete
terms on the other hand in a way which is undesirable.

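In fact, from the design of our abstract domains and the fact that we are analysing
prescriptively typed programs, it follows that whenever an expression
$c^{\mathcal{A}}(\dots) \oplus c'^{\mathcal{A}}(\dots)$ occurs, then $c = c'$. This also explains why
in the definition of $\alpha$, the abstraction of those $t_i$ such that $\tau_i \bowtie \tau$ but
$\tau_i \neq \tau$ is included in reserved argument positions of $c^{\mathcal{A}}(\dots)$, whereas
the abstraction of those $t_i$ such that $\tau_i = \tau$ is directly conjoined (using $\oplus$)
with the whole expression $c^{\mathcal{A}}(\dots)$.

Looking at Ex. 4.2, one might expect that
$\mathrm{list}^{\mathcal{A}}(\mathrm{int}^{\mathcal{A}}) \oplus \mathrm{list}^{\mathcal{A}}(\emptyset)$ can be
simplified to $\mathrm{list}^{\mathcal{A}}(\mathrm{int}^{\mathcal{A}})$. Maybe less obvious, one might also
expect that the abstract term
$\mathrm{nest}^{\mathcal{A}}(\emptyset, \mathrm{list}^{\mathcal{A}}(\mathrm{nest}^{\mathcal{A}}(\mathrm{int}^{\mathcal{A}}, \emptyset)))$
can be simplified to $\mathrm{nest}^{\mathcal{A}}(\mathrm{int}^{\mathcal{A}}, \emptyset)$. We now
extend ACI1 by further axioms for this purpose.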

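Definition 4.3 For each $c^{\mathcal{A}}/m \in \mathcal{F}^{\mathcal{A}}$, the distributivity axiom is defined as follows:

$$c^{\mathcal{A}}(x_1, \dots, x_m) \oplus c^{\mathcal{A}}(y_1, \dots, y_m) = c^{\mathcal{A}}(x_1 \oplus y_1, \dots, x_m \oplus y_m) \qquad (8)$$

Moreover, consider a flat type $\phi = d(\bar v)$ such that
$\triangleleft(\phi) = \langle \sigma_1, \dots, \sigma_{l'} \rangle$,
$\bowtie(\phi) = \langle \sigma_{l'+1}, \dots, \sigma_l \rangle$. For each $j \in \{l'+1, \dots, l\}$,
we have $\sigma_j = \tau\Theta$ for some flat type $\tau = c(\bar u)$ and some $\Theta$. Suppose
$\triangleleft(\tau) = \langle \rho_1, \dots, \rho_{m'} \rangle$,
$\bowtie(\tau) = \langle \rho_{m'+1}, \dots, \rho_m \rangle$. We define the extraction axiom for
$d^{\mathcal{A}}$ and $\sigma_j$ as follows:

$$
\begin{aligned}
& d^{\mathcal{A}}(x_1, \dots, x_{j-1},\ c^{\mathcal{A}}(y_1, \dots, y_m) \oplus x_j,\ x_{j+1}, \dots, x_l) = {} \\
& \quad d^{\mathcal{A}}\Bigl( x_1 \oplus \bigoplus_{\rho_k\Theta = \sigma_1} y_k,\ \dots,\ x_{j-1} \oplus \bigoplus_{\rho_k\Theta = \sigma_{j-1}} y_k,\ x_j,\ x_{j+1} \oplus \bigoplus_{\rho_k\Theta = \sigma_{j+1}} y_k,\ \dots,\ x_l \oplus \bigoplus_{\rho_k\Theta = \sigma_l} y_k \Bigr) \\
& \quad \oplus \bigoplus_{\rho_k\Theta = \phi} y_k.
\end{aligned}
$$

Let ACI1DE be the theory given by the ACI1 axioms and the distributivity and
extraction axioms. We abbreviate ACI1DE by AC+ and denote equality modulo
AC+ as $=_{AC+}$.

Note that applying a distributivity or extraction axiom from left to right decreases
the number of occurrences of function symbols by 1.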
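Example 4.4 Consider LISTS and NESTS. The extraction axiom for nest(V)
and list(nest(V)) is

$$\mathrm{nest}^{\mathcal{A}}(x_1, \mathrm{list}^{\mathcal{A}}(y) \oplus x_2) = \mathrm{nest}^{\mathcal{A}}(x_1, x_2) \oplus y.$$

In AC+, we have

$$
\begin{aligned}
\mathrm{list}^{\mathcal{A}}(\mathrm{int}^{\mathcal{A}}) \oplus \mathrm{list}^{\mathcal{A}}(\emptyset) &=_{AC+} \mathrm{list}^{\mathcal{A}}(\mathrm{int}^{\mathcal{A}} \oplus \emptyset) =_{AC+} \mathrm{list}^{\mathcal{A}}(\mathrm{int}^{\mathcal{A}}) \\
\mathrm{nest}^{\mathcal{A}}(\emptyset, \mathrm{list}^{\mathcal{A}}(\mathrm{nest}^{\mathcal{A}}(\mathrm{int}^{\mathcal{A}}, \emptyset))) &=_{AC+} \mathrm{nest}^{\mathcal{A}}(\emptyset, \emptyset) \oplus \mathrm{nest}^{\mathcal{A}}(\mathrm{int}^{\mathcal{A}}, \emptyset) =_{AC+} \mathrm{nest}^{\mathcal{A}}(\mathrm{int}^{\mathcal{A}}, \emptyset).
\end{aligned}
$$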
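The two simplifications of Example 4.4 can be replayed mechanically. The following Python sketch is only an illustration under stated assumptions: abstract terms are encoded as frozensets of summands (a hypothetical encoding in which ACI1 holds by construction), and the function `normalise` hard-wires the distributivity axiom for every constructor and the single extraction axiom for nest(V) and list(nest(V)); it is not a general AC+ procedure.

```python
# Illustrative encoding (an assumption, not the paper's notation):
# an abstract term is a frozenset of summands joined by (+), so ACI1 holds
# by construction; a summand is a variable name or (constructor, args...).
EMPTY = frozenset()


def sym(name, *args):
    """Abstract term consisting of the single summand name^A(args)."""
    return frozenset({(name, *args)})


def normalise(term):
    """Apply distributivity, and extraction for nest/list, left to right."""
    out = {s for s in term if isinstance(s, str)}        # top-level variables
    merged = {}                                          # constructor -> arguments
    for s in term:
        if isinstance(s, str):
            continue
        name, args = s[0], s[1:]
        if name in merged:                               # distributivity axiom
            merged[name] = [x | y for x, y in zip(merged[name], args)]
        else:
            merged[name] = list(args)
    pulled = EMPTY
    for name, args in merged.items():
        args = [normalise(a) for a in args]
        if name == "nest":
            # extraction: nest^A(x1, list^A(y) (+) x2) = nest^A(x1, x2) (+) y
            for u in args[1]:
                if not isinstance(u, str) and u[0] == "list":
                    pulled |= u[1]
            # in a well-typed term only variables remain in this position
            args[1] = frozenset(u for u in args[1] if isinstance(u, str))
        out.add((name, *args))
    result = frozenset(out)
    # extracted summands may still have to be distributed with the rest
    return normalise(result | pulled) if pulled else result


int_a = sym("int")
lists = sym("list", int_a) | sym("list", EMPTY)
nests = sym("nest", EMPTY, sym("list", sym("nest", int_a, EMPTY)))
print(normalise(lists) == sym("list", int_a))
print(normalise(nests) == sym("nest", int_a, EMPTY))
```

Each rewrite removes one function symbol, which is why the recursion in this sketch terminates.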



In [CL00], it is mentioned that one might have chosen to add a distributivity axiom
but the authors argue the case for not doing so. The extraction axiom has no equivalent
in [CL00].



4.2 Normal Abstract Terms 



Following [CL00], we have defined the abstraction by structural induction on a term.
This definition is a good basis for our analysis, but it is still unsatisfactory: as it is 
stated (i.e. without applying any axioms), even for ground terms, the abstraction of a 
term is proportional in size to the term itself; consequently, the abstraction is not in 
a form that makes it convenient to read any properties of the concrete term from it. 

We first show that using AC+, any abstract term can be converted into a normal form.
To this end, it is useful to view abstract terms as typed terms according to the rules
of Table 1. For each $\tau = c(\bar u)$ with $\triangleleft(\tau) = \langle \rho_1, \dots, \rho_{m'} \rangle$
and $\bowtie(\tau) = \langle \rho_{m'+1}, \dots, \rho_m \rangle$, we declare
$c^{\mathcal{A}}_{\rho_1 \dots \rho_m \to \tau}$. Moreover, we declare $\emptyset_{\to u}$ and
$\oplus_{u,u \to u}$. Those declarations are designed exactly so that the following
proposition holds.

Proposition 4.5 The $\triangleleft$- and $\bowtie$-relations based on the declared types of the
"abstract functions" in $\mathcal{F}^{\mathcal{A}}$ are the same as the $\triangleleft$- and
$\bowtie$-relations based on $\mathcal{F}$.

To distinguish type judgements in the concrete and abstract language, we use
$\vdash^{\mathcal{A}}$ for the latter. The following proposition says that the abstraction of a
term has the same type as the concrete term itself. Its proof is straightforward by
structural induction.

Proposition 4.6 If $\Gamma \vdash t : \tau$ then $\Gamma \vdash^{\mathcal{A}} \alpha(t) : \tau$.

The next lemma says that application of the equality axioms preserves well-typing. 
The interesting axiom is the extraction axiom. 

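Lemma 4.7 If $\Gamma \vdash^{\mathcal{A}} a : \tau$ and $a =_{AC+} b$ then $\Gamma \vdash^{\mathcal{A}} b : \tau$.

Proof. We only consider the extraction axiom. Assume the notations of Def. 4.3,
and consider an abstract term

$$a = d^{\mathcal{A}}(b_1, \dots, b_{j-1},\ c^{\mathcal{A}}(a_1, \dots, a_m) \oplus b_j,\ \dots, b_l)$$

where $\_ \vdash^{\mathcal{A}} a : \phi\Theta'$ for some $\Theta'$ (note that the typing rules are such
that $a$ must have a type that is an instance of $\phi$, but not necessarily $\phi$ itself). By the
rules of Table 1, we must have $\_ \vdash^{\mathcal{A}} b_{j'} : \sigma_{j'}\Theta'$ for all
$j' \in \{1, \dots, l\}$ and $\_ \vdash^{\mathcal{A}} a_k : \rho_k\Theta\Theta'$ for all
$k \in \{1, \dots, m\}$. Therefore it follows that for $j' \in \{1, \dots, l\} \setminus \{j\}$, we have

$$\_ \vdash^{\mathcal{A}} b_{j'} \oplus \bigoplus_{\rho_k\Theta = \sigma_{j'}} a_k : \sigma_{j'}\Theta', \quad\text{and we also have}\quad \_ \vdash^{\mathcal{A}} \bigoplus_{\rho_k\Theta = \phi} a_k : \phi\Theta'.$$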




^These declarations violate the Simple Range Condition, and in any case would not be permissible 
in any existing typed programming language since a range type must not be a parameter. However, 
this causes no problems for our theoretical considerations. 
This implies that if $b$ is obtained from $a$ by applying the extraction axiom in the $j$th
position, then $\_ \vdash^{\mathcal{A}} b : \phi\Theta'$. □

We now define normal forms for abstract terms. To simplify the notation, we
denote a variable sequence $x_1 \oplus \dots \oplus x_n$ as $x^{\oplus}$, and of course, this is
$\emptyset$ if $n = 0$. Note that the following definition is by structural induction: a normal
abstract term for $\tau\Theta$ is defined based on a normal abstract term for some $\rho_i\Theta$.
The well-foundedness of this induction is not obvious but has been stated in
[SHK00, Lemma 4.3].

Definition 4.8 For a parameter, the only normal abstract term is $\emptyset$.

Now let $\tau = c(\bar u)$ be a flat type such that
$\triangleleft(\tau) = \langle \rho_1, \dots, \rho_{m'} \rangle$ and
$\bowtie(\tau) = \langle \rho_{m'+1}, \dots, \rho_m \rangle$, and $\Theta$ be any type substitution.
A normal abstract term for $\tau\Theta$ is $\emptyset$ or of the form
$c^{\mathcal{A}}(a_1 \oplus x_1^{\oplus}, \dots, a_{m'} \oplus x_{m'}^{\oplus}, x_{m'+1}^{\oplus}, \dots, x_m^{\oplus}) \oplus x^{\oplus}$,
where for each $i \in \{1, \dots, m'\}$, $a_i$ is a normal abstract term for $\rho_i\Theta$.

Note that in a normal abstract term, the "second half" of argument positions,
i.e., those corresponding to recursive subterm types of $\tau$ other than $\tau$ itself, must
only contain variables. Considering again Def. 4.1, intuitively the abstractions of the
recursive subterms of $t$ are stored in those positions only temporarily. The ultimate
goal is to remove any non-variables there using the axioms, in particular the extraction
axiom.

Based on the propositions above, one can show: 

Theorem 4.9 For any $t$ with $\_ \vdash t : \phi$, $\alpha(t)$ has a representative which is a
normal abstract term for $\phi$.

Proof. By Prop. 4.6 and Lemma 4.7 we have $\_ \vdash^{\mathcal{A}} a : \phi$ if $a =_{AC+} \alpha(t)$.

Assume the notations of Def. 4.3. The fact that an abstract term is typed according
to the rules of Table 1 means in particular that if it has the form
$d^{\mathcal{A}}(b_1, \dots, b_{j-1},\ \dots \oplus c'^{\mathcal{A}}(\dots) \oplus \dots,\ b_{j+1}, \dots, b_l)$
where $j \in \{l'+1, \dots, l\}$, i.e., it is not in normal form,
then $c' = c$ and an extraction axiom is applicable, possibly after several applications
of associativity and commutativity.

Likewise, if an abstract term has the form
$\dots d^{\mathcal{A}}(\dots) \oplus \dots \oplus d'^{\mathcal{A}}(\dots) \dots$, then $d = d'$
and the distributivity axiom is applicable.

Since the distributivity and extraction axioms can only be applied a finite number of 
times, it follows that successive application of them yields an abstract term in normal 
form. □ 



Example 4.4 shows the conversion of two abstract terms to their normal forms. 



We can make some further observations. The first follows from the definition of a. 

Proposition 4.10 For a term $t$, $\alpha(t)$ contains variables if and only if $t$ contains
variables.

The next proposition follows from the fact that by the transparency condition, no
subterm type of a ground type $\phi$ can contain a parameter.

Proposition 4.11 If $\phi$ is a ground type, then there is exactly one normal abstract
term for $\phi$ not containing variables and not containing $\emptyset$.

Let $a$ be this abstract term. Then any other normal abstract term for $\phi$ not
containing variables is obtained by replacing some subterms in $a$ with $\emptyset$.

The previous three statements tell us that the size of the abstraction of a ground 
term depends only on its type and not on the size of the term itself. However, it would 
be wrong to conclude that all ground terms of the same type have the same abstraction. 
The problem is due to polymorphism: one term can have several types. For example,
it is correct to say that both [] and [7] are of type list(int), but
$\alpha([]) = \mathrm{list}^{\mathcal{A}}(\emptyset)$ and
$\alpha([7]) = \mathrm{list}^{\mathcal{A}}(\mathrm{int}^{\mathcal{A}})$. However, we can state:

Lemma 4.12 If $\vdash s : \phi$ and $\vdash t : \phi$ and $s, t$ are ground, then both $\alpha(s)$
and $\alpha(t)$ are obtained from the unique normal abstract term as mentioned in Prop. 4.11 by
replacing zero or more subterms with $\emptyset$.



5 Relating the Abstraction and the Labels 



Now that we know that each abstract term has a representative in a compact normal
form, we state a theorem which relates the abstraction of a term to the labels as
defined in Sec. 3. Actually, it would have been possible to have the following theorem
as definition of $\alpha$, and have our present definition as a lemma. This is effectively what
we did in [SHK00]. The theorem also links [CL00] with [LS01]. Note that $\alpha$ is lifted
to sets as follows: $\alpha(S) := \bigoplus_{t \in S} \alpha(t)$.

Before we can show the theorem, we extend the definition of Z to abstract terms, 
typed as shown in the previous section. We state the following lemma. 

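Lemma 5.1 Let $\phi$ and $\tau$ be flat types such that $\tau\Theta \bowtie \phi$ for some $\Theta$.
Let $t$ be a term such that $\_ \vdash t : \tau\Theta\Theta'$ for some $\Theta'$. Then for any
$\sigma \triangleleft \phi$, or $\sigma \bowtie \phi$, we have
$\alpha(Z(\phi, \tau\Theta, \sigma, t)) = Z(\phi, \tau\Theta, \sigma, \alpha(t))$.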

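Proof. The proof is by induction on the depth of $t$. First suppose that $t \in V$. Then
we have to distinguish whether $\sigma = \tau\Theta$ or $\sigma \neq \tau\Theta$. In the first case,
$\alpha(Z(\phi, \tau\Theta, \sigma, t)) = t = Z(\phi, \tau\Theta, \sigma, \alpha(t))$. In the second
case, $\alpha(Z(\phi, \tau\Theta, \sigma, t)) = \emptyset = Z(\phi, \tau\Theta, \sigma, \alpha(t))$.

Now suppose that $t$ is a constant. Again, we have to distinguish whether $\sigma = \tau\Theta$ or
$\sigma \neq \tau\Theta$. In the first case,
$\alpha(Z(\phi, \tau\Theta, \sigma, t)) = c^{\mathcal{A}}(\emptyset, \dots, \emptyset) = Z(\phi, \tau\Theta, \sigma, \alpha(t))$.
In the second case,
$\alpha(Z(\phi, \tau\Theta, \sigma, t)) = \emptyset = Z(\phi, \tau\Theta, \sigma, \alpha(t))$.

Now consider a term $t = f_{\tau_1 \dots \tau_n \to \tau}(t_1, \dots, t_n)$ and suppose the result has
been proven for $t_1, \dots, t_n$. Suppose $\triangleleft(\tau) = \langle \rho_1, \dots, \rho_{m'} \rangle$
and $\bowtie(\tau) = \langle \rho_{m'+1}, \dots, \rho_m \rangle$. Consider first
$\sigma \triangleleft \phi$. In the following equation sequence, (*) marks steps that use simple
rearrangements such as lifting a function to sets.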



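[Equation sequence garbled in the source. It starts from $\alpha(Z(\phi, \tau\Theta, \sigma, t))$ and ends with
$Z\bigl(\phi, \tau\Theta, \sigma,\ c^{\mathcal{A}}\bigl(\bigoplus_{\tau_i = \rho_1} \alpha(t_i), \dots, \bigoplus_{\tau_i = \rho_m} \alpha(t_i)\bigr) \oplus \bigoplus_{\tau_i = \tau} \alpha(t_i)\bigr)$,
i.e. $Z(\phi, \tau\Theta, \sigma, \alpha(t))$ by Def. 4.1. The legible step justifications are two lemma references (numbers illegible), the induction hypothesis, several (*) steps, and Def. 4.1.]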

The remaining cases, that either $\sigma \bowtie \phi$ and $\sigma \neq \phi$, or $\sigma = \phi$, are
very similar and hence omitted. □

We can now state the theorem. 

Theorem 5.2 Let $\tau = c(\bar u)$ be a flat type such that
$\triangleleft(\tau) = \langle \rho_1, \dots, \rho_{m'} \rangle$ and
$\bowtie(\tau) = \langle \rho_{m'+1}, \dots, \rho_m \rangle$. For any term
$t = f_{\tau_1 \dots \tau_n \to \tau}(\dots)$, we have

$$
\begin{aligned}
\alpha(t) =_{AC+} c^{\mathcal{A}}\bigl( & \alpha(Z(\tau, \tau, \rho_1, t)), \dots, \alpha(Z(\tau, \tau, \rho_{m'}, t)), \\
& \alpha(Z(\tau, \tau, \rho_{m'+1}, t)) \cap V, \dots, \alpha(Z(\tau, \tau, \rho_m, t)) \cap V \bigr) \oplus (Z(\tau, \tau, \tau, t) \cap V).
\end{aligned}
$$

Proof. Suppose the normal form of $\alpha(t)$ is $c^{\mathcal{A}}(a_1, \dots, a_m) \oplus a$.

Consider some $j \in \{1, \dots, m'\}$. Since $a$, as well as $a_k$ for all $m' < k \le m$, only
consist of variables, it follows by Prop. 4.5 and Lemma 3.16 that

$$Z(\tau, \tau, \rho_j, c^{\mathcal{A}}(a_1, \dots, a_m) \oplus a) = a_j.$$

At the same time, by Lemma 5.1,

$$Z(\tau, \tau, \rho_j, \alpha(t)) = \alpha(Z(\tau, \tau, \rho_j, t)).$$

Now consider some $j \in \{m'+1, \dots, m\}$. Since $a$, as well as $a_k$ for all $m' < k \le m$,
only consist of variables, it follows by Prop. 4.5 and Lemma 3.16 that

$$Z(\tau, \tau, \rho_j, c^{\mathcal{A}}(a_1, \dots, a_m) \oplus a) \cap V = a_j.$$

At the same time, by Lemma 5.1,

$$Z(\tau, \tau, \rho_j, \alpha(t)) = \alpha(Z(\tau, \tau, \rho_j, t)).$$

Finally consider $\tau$. Since $a$, as well as $a_k$ for all $m' < k \le m$, only consist of
variables, it follows by Prop. 4.5 and Lemma 3.16 that

$$Z(\tau, \tau, \tau, c^{\mathcal{A}}(a_1, \dots, a_m) \oplus a) \cap V = a.$$

At the same time, by Lemma 5.1,

$$Z(\tau, \tau, \tau, \alpha(t)) = \alpha(Z(\tau, \tau, \tau, t)).$$

Thus we have shown that the normal form of $\alpha(t)$ is as stated. □
Example 5.3 Consider LISTS. We have

$$
\begin{aligned}
\alpha([[X],[7]]) &= \mathrm{list}^{\mathcal{A}}\bigl(\alpha(Z(\mathrm{list}(U), \mathrm{list}(U), U, [[X],[7]]))\bigr) \oplus \bigl(Z(\mathrm{list}(U), \mathrm{list}(U), \mathrm{list}(U), [[X],[7]]) \cap V\bigr) \\
&= \mathrm{list}^{\mathcal{A}}(\alpha(\{[X],[7]\})) \oplus (\{[[X],[7]],\ [[7]],\ []\} \cap V) \\
&= \mathrm{list}^{\mathcal{A}}(\mathrm{list}^{\mathcal{A}}(X \oplus \mathrm{int}^{\mathcal{A}})) \oplus \emptyset \\
&=_{AC+} \mathrm{list}^{\mathcal{A}}(\mathrm{list}^{\mathcal{A}}(X \oplus \mathrm{int}^{\mathcal{A}})).
\end{aligned}
$$

The theorem tells us how to read the abstract term. First, the absence of a variable
on the highest level (i.e. $\alpha([[X],[7]])$ is not of the form $x \oplus \dots$) means that
$Z(\mathrm{list}(U), \mathrm{list}(U), \mathrm{list}(U), [[X],[7]])$ contains no variables, or, to use the
notation of [LS01] applied to the grammar in Ex. 3.11, $\zeta(LL, LL, [[X],[7]])$ is empty.
Likewise, the theorem tells us that the argument of the outermost $\mathrm{list}^{\mathcal{A}}$ contains
the abstraction of all subterms of [[X],[7]] returned by
$Z(\mathrm{list}(U), \mathrm{list}(U), U, [[X],[7]])$, and again in terms of [LS01], the absence of
variables at this level tells us that $\zeta(LL, L, [[X],[7]])$ is empty.



6 The Analysis 

In this section, we show how an entire program is abstracted based on an abstraction 
of the fundamental operation, namely unification. The abstract program is then given 
a semantics. This semantics describes, in a well-defined sense, the semantics of the
concrete program. Moreover, under reasonable assumptions, it is finitely computable. 



6.1 Abstract Interpretation 

In this subsection, we link our formalism to the standard definitions of abstract inter- 
pretation. 

Our abstract terms may contain variables, and hence it is only natural to define
substitutions as for concrete terms, only that the range of those substitutions will
contain abstract terms. The instantiation order $\le_{AC+}$ is defined as follows:
$a \le_{AC+} b$ if $b\theta^{\mathcal{A}} =_{AC+} a$ for some $\theta^{\mathcal{A}}$. It is lifted to
substitutions: $\theta_1^{\mathcal{A}} \le_{AC+} \theta_2^{\mathcal{A}}$ if
$a\theta_1^{\mathcal{A}} \le_{AC+} a\theta_2^{\mathcal{A}}$ for all $a$. We write $a \approx b$ for
$a \le_{AC+} b \wedge b \le_{AC+} a$. One should not confuse $\approx$ with $=_{AC+}$! Our notation
follows [CL00], not [LS01]. An abstract atom is an atom using abstract terms. We denote
the set of abstract atoms by $\mathcal{B}^{\mathcal{A}}$.

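In order to define and relate semantics for concrete and abstract programs in the
framework of abstract interpretation, we consider sets of (abstract) atoms with a
suitable notion of ordering and equivalence. We consider the lower power domain
or Hoare domain [GS90]. For sets of abstract atoms $I_1^{\mathcal{A}}$ and $I_2^{\mathcal{A}}$, we define

$$I_1^{\mathcal{A}} \le_{AC+} I_2^{\mathcal{A}} \iff \forall A_1^{\mathcal{A}} \in I_1^{\mathcal{A}}\ \exists A_2^{\mathcal{A}} \in I_2^{\mathcal{A}}.\ A_1^{\mathcal{A}} \le_{AC+} A_2^{\mathcal{A}},$$

and $I_1^{\mathcal{A}} \approx I_2^{\mathcal{A}}$ if
$I_1^{\mathcal{A}} \le_{AC+} I_2^{\mathcal{A}} \wedge I_2^{\mathcal{A}} \le_{AC+} I_1^{\mathcal{A}}$. The
elements of $[2^{\mathcal{B}^{\mathcal{A}}}]_{\approx}$ are called abstract interpretations. Abusing
notation, we denote $[2^{\mathcal{B}^{\mathcal{A}}}]_{\approx}$ by $2^{\mathcal{B}^{\mathcal{A}}}$.

We call a set of abstract atoms $I^{\mathcal{A}}$ downwards-closed if
$A^{\mathcal{A}} \in I^{\mathcal{A}}$ implies $A'^{\mathcal{A}} \in I^{\mathcal{A}}$ for all
$A'^{\mathcal{A}} \le_{AC+} A^{\mathcal{A}}$. The relation $\approx$ is defined in such a way that each
$I^{\mathcal{A}} \in 2^{\mathcal{B}^{\mathcal{A}}}$ is equivalent to a downwards-closed set. This
observation implies the following lemma.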



Lemma 6.1 [CL00, Lemma 3.1] $(2^{\mathcal{B}^{\mathcal{A}}}, \le_{AC+})$ is a complete lattice.



Definition 6.2 A Galois insertion is a quadruple
$\langle (A, \sqsubseteq_A), \alpha, (B, \sqsubseteq_B), \gamma \rangle$ where

1. $(A, \sqsubseteq_A)$ and $(B, \sqsubseteq_B)$ are complete lattices of concrete and abstract
domains, respectively;

2. $\alpha : A \to B$ and $\gamma : B \to A$ are monotonic functions called abstraction and
concretisation, respectively; and

3. $a \sqsubseteq_A \gamma(\alpha(a))$ and $\alpha(\gamma(b)) = b$ for every $a \in A$ and $b \in B$.
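The three conditions of Def. 6.2 can be checked exhaustively on a small finite example. The Python sketch below uses a toy parity abstraction over subsets of {0, 1, 2, 3}; this domain, and the names `alpha`, `gamma`, `CONCRETE` and `ABSTRACT`, are assumptions chosen only to illustrate the definition and are unrelated to the abstraction of Def. 4.1.

```python
from itertools import combinations


def powerset(xs):
    """All subsets of xs as frozensets."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]


def parity(n):
    return "even" if n % 2 == 0 else "odd"


CONCRETE = powerset(range(4))             # lattice of subsets of {0,1,2,3}
ABSTRACT = powerset(["even", "odd"])      # lattice of subsets of {even, odd}


def alpha(a):
    """Toy abstraction: the set of parities occurring in a."""
    return frozenset(parity(n) for n in a)


def gamma(b):
    """Toy concretisation: all numbers whose parity is in b."""
    return frozenset(n for n in range(4) if parity(n) in b)


# Condition 2: both maps are monotonic w.r.t. subset inclusion.
monotone = all(alpha(x) <= alpha(y) for x in CONCRETE for y in CONCRETE if x <= y) \
    and all(gamma(x) <= gamma(y) for x in ABSTRACT for y in ABSTRACT if x <= y)
# Condition 3: a is below gamma(alpha(a)), and alpha(gamma(b)) equals b.
extensive = all(a <= gamma(alpha(a)) for a in CONCRETE)
insertion = all(alpha(gamma(b)) == b for b in ABSTRACT)
print(monotone, extensive, insertion)
```

The equality in the last check is what distinguishes an insertion from a general Galois connection: no abstract element is redundant.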



In the above definition, the $\alpha$ is a priori not the $\alpha$ of Def. 4.1, but of course, we
have used the same letter because we link the two in the natural way. 



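Theorem 6.3 [CL00, Theorem 3.3] Define $\alpha$ and $\gamma$ as follows:

$$
\begin{aligned}
\alpha &: 2^{\mathcal{B}} \to 2^{\mathcal{B}^{\mathcal{A}}}, & \alpha(I) &= \{\alpha(A) \mid A \in I\}, \\
\gamma &: 2^{\mathcal{B}^{\mathcal{A}}} \to 2^{\mathcal{B}}, & \gamma(I^{\mathcal{A}}) &= \bigcup \{I \mid \alpha(I) \le_{AC+} I^{\mathcal{A}}\}.
\end{aligned}
$$

Then $(2^{\mathcal{B}}, \alpha, 2^{\mathcal{B}^{\mathcal{A}}}, \gamma)$ is a Galois insertion.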



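Definition 6.4 An abstract term $a$ describes a concrete term $t$, denoted $a \propto t$, if
$\alpha(t) \le_{AC+} a$ (and likewise for atoms).

Note that for an atom $A$ and an abstract atom $A^{\mathcal{A}}$, we have
$A^{\mathcal{A}} \propto A$ if and only if $A \in \gamma(\{A^{\mathcal{A}}\})$. For an interpretation
$I$ and an abstract interpretation $I^{\mathcal{A}}$, we define $I^{\mathcal{A}} \propto I$ if
$I \subseteq \gamma(I^{\mathcal{A}})$, or equivalently, $\alpha(I) \le_{AC+} I^{\mathcal{A}}$.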



6.2 Abstract Unification 

In this subsection, we show how abstract unification describes concrete unification. 
First, we can relate abstraction and application of a substitution as follows. 



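Lemma 6.5 [CL00, Lemma 4.1] Let $t$ be a term and $\theta$ a substitution. Then
$\alpha(t\theta) =_{AC+} \alpha(t)\{x/\alpha(x\theta) \mid x \in dom(\theta)\}$.

Proof. By structural induction on the term. Let
$\theta^{\mathcal{A}} = \{x/\alpha(x\theta) \mid x \in dom(\theta)\}$.

If $t \in V$ and $t \notin dom(\theta)$, the result trivially holds. If $t \in V$ and
$t \in dom(\theta)$, then $\alpha(t\theta) = t\{t/\alpha(t\theta)\} = \alpha(t)\theta^{\mathcal{A}}$.

Now consider $t = f_{\tau_1 \dots \tau_n \to \tau}(t_1, \dots, t_n)$, where $\tau = c(\bar u)$,
$\triangleleft(\tau) = \langle \rho_1, \dots, \rho_{m'} \rangle$, and
$\bowtie(\tau) = \langle \rho_{m'+1}, \dots, \rho_m \rangle$, and assume that the result holds for
$t_1, \dots, t_n$. We have

$$
\begin{aligned}
\alpha(t\theta) &= c^{\mathcal{A}}\Bigl( \bigoplus_{\tau_i = \rho_1} \alpha(t_i\theta), \dots, \bigoplus_{\tau_i = \rho_m} \alpha(t_i\theta) \Bigr) \oplus \bigoplus_{\tau_i = \tau} \alpha(t_i\theta) \\
&=_{AC+} c^{\mathcal{A}}\Bigl( \bigoplus_{\tau_i = \rho_1} \alpha(t_i)\theta^{\mathcal{A}}, \dots, \bigoplus_{\tau_i = \rho_m} \alpha(t_i)\theta^{\mathcal{A}} \Bigr) \oplus \bigoplus_{\tau_i = \tau} \alpha(t_i)\theta^{\mathcal{A}} = \alpha(t)\theta^{\mathcal{A}}.
\end{aligned}
$$

□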



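Example 6.6 By Lemma 6.5,

$$
\begin{aligned}
\alpha([X|Y]\{X/7, Y/\mathrm{nil}\}) &= \alpha([7]) \\
&= \mathrm{list}^{\mathcal{A}}(\mathrm{int}^{\mathcal{A}}) \\
&=_{AC+} \mathrm{list}^{\mathcal{A}}(\mathrm{int}^{\mathcal{A}}) \oplus \mathrm{list}^{\mathcal{A}}(\emptyset) \\
&= (\mathrm{list}^{\mathcal{A}}(X) \oplus Y)\{X/\mathrm{int}^{\mathcal{A}}, Y/\mathrm{list}^{\mathcal{A}}(\emptyset)\} \\
&= \alpha([X|Y])\{X/\alpha(7), Y/\alpha(\mathrm{nil})\}.
\end{aligned}
$$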
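The equation of Example 6.6 can be checked mechanically for lists. The Python sketch below is an illustration under stated assumptions, not the paper's implementation: abstract terms are encoded as frozensets of summands (a hypothetical encoding in which ACI1 holds by construction), only int and list(U) are supported, and `normalise` applies just the distributivity axiom, which is all that is needed for this type.

```python
# Illustrative encoding (an assumption): frozensets of summands model (+).
EMPTY = frozenset()


def sym(name, *args):
    return frozenset({(name, *args)})


def alpha(t):
    """Abstraction for ints, 'nil', variables (other strings) and cons cells."""
    if isinstance(t, int):
        return sym("int")
    if t == "nil":
        return sym("list", EMPTY)
    if isinstance(t, str):
        return frozenset({t})
    _, head, tail = t
    return sym("list", alpha(head)) | alpha(tail)


def subst(t, theta):
    """Apply a concrete substitution (dict from variable names to terms)."""
    if isinstance(t, str):
        return theta.get(t, t)
    if isinstance(t, tuple):
        return ("cons", subst(t[1], theta), subst(t[2], theta))
    return t


def subst_abs(a, theta_a):
    """Apply an abstract substitution (dict from variables to abstract terms)."""
    out = set()
    for s in a:
        if isinstance(s, str):
            out |= theta_a.get(s, frozenset({s}))
        else:
            out.add((s[0], *(subst_abs(x, theta_a) for x in s[1:])))
    return frozenset(out)


def normalise(a):
    """Merge summands with the same constructor (distributivity axiom)."""
    out = {s for s in a if isinstance(s, str)}
    merged = {}
    for s in a:
        if not isinstance(s, str):
            old = merged.get(s[0])
            merged[s[0]] = list(s[1:]) if old is None else [x | y for x, y in zip(old, s[1:])]
    out |= {(name, *(normalise(x) for x in args)) for name, args in merged.items()}
    return frozenset(out)


t = ("cons", "X", "Y")                               # the term [X|Y]
theta = {"X": 7, "Y": "nil"}
theta_a = {x: alpha(v) for x, v in theta.items()}    # {x / alpha(x theta)}
lhs = normalise(alpha(subst(t, theta)))
rhs = normalise(subst_abs(alpha(t), theta_a))
print(lhs == rhs == sym("list", sym("int")))
```

Comparing normal forms stands in for equality modulo AC+ here; before normalisation the two sides of this particular instance already coincide as sets of summands.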



The following theorem is a straightforward consequence. 
Theorem 6.7 [CL00, Thm. 4.2] Let $t_1, t_2$ be terms. If $t_1 \le t_2$ then
$\alpha(t_1) \le_{AC+} \alpha(t_2)$ (and likewise for atoms).

Definition 6.8 We denote by $cU_{AC+}(o_1, o_2)$ a complete set of AC+-unifiers of
syntactic objects $o_1, o_2$, i.e., a set of abstract substitutions such that for each
$\theta^{\mathcal{A}} \in cU_{AC+}(o_1, o_2)$, we have $o_1\theta^{\mathcal{A}} =_{AC+} o_2\theta^{\mathcal{A}}$,
and for any $\theta^{\mathcal{A}}$ such that $o_1\theta^{\mathcal{A}} =_{AC+} o_2\theta^{\mathcal{A}}$,
we have $\theta^{\mathcal{A}} \le_{AC+} \theta'^{\mathcal{A}}$ for some
$\theta'^{\mathcal{A}} \in cU_{AC+}(o_1, o_2)$.

The next theorem says that unification of abstract terms is a correct abstract unifi- 
cation. The second part of the statement says that an abstract substitution correspond- 
ing to a concrete substitution as stated correctly mimics that concrete substitution, 
for any atom. 



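Theorem 6.9 [CL00, Thm. 4.4] Let $A_1, A_2$ be atoms that are unifiable with MGU $\theta$,
and $A_1^{\mathcal{A}}, A_2^{\mathcal{A}}$ be abstract atoms such that $A_1^{\mathcal{A}} \propto A_1$ and
$A_2^{\mathcal{A}} \propto A_2$. Then there exists a unifier
$\theta^{\mathcal{A}} \in cU_{AC+}(A_1^{\mathcal{A}}, A_2^{\mathcal{A}})$ such that
$A_1^{\mathcal{A}}\theta^{\mathcal{A}} \propto A_1\theta$, and moreover for any atom $B$,
we have $\alpha(B)\theta^{\mathcal{A}} \propto B\theta$.

Proof. Consider the pairs $(B, A_1)$, $(B, A_2)$ where $B$ is an arbitrary atom. Since
$(B, A_i)\theta \le (B, A_i)$, we have by Thm. 6.7 and the definition of $\propto$

$$\alpha((B, A_i)\theta) \le_{AC+} \alpha((B, A_i)) \le_{AC+} (\alpha(B), A_i^{\mathcal{A}}) \qquad (i = 1, 2).$$

Since $A_1\theta = A_2\theta$, it follows that $\alpha((B, A_1)\theta) = \alpha((B, A_2)\theta)$ and so
$\alpha((B, A_1)\theta)$ is a common AC+-instance of $(\alpha(B), A_1^{\mathcal{A}})$ and
$(\alpha(B), A_2^{\mathcal{A}})$. Hence

$$cU_{AC+}((\alpha(B), A_1^{\mathcal{A}}), (\alpha(B), A_2^{\mathcal{A}}))$$

contains a $\theta^{\mathcal{A}}$ such that
$\alpha((B, A_1)\theta) \le_{AC+} (\alpha(B), A_1^{\mathcal{A}})\theta^{\mathcal{A}}$ and so
$\alpha(B\theta) \le_{AC+} \alpha(B)\theta^{\mathcal{A}}$ and
$\alpha(A_1\theta) \le_{AC+} A_1^{\mathcal{A}}\theta^{\mathcal{A}}$. Now
$cU_{AC+}((\alpha(B), A_1^{\mathcal{A}}), (\alpha(B), A_2^{\mathcal{A}}))$ is also a complete set
of unifiers of $A_1^{\mathcal{A}}$ and $A_2^{\mathcal{A}}$, and so the claim follows. □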

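In addition, we also have that AC+-unification is optimal. To make this notion
precise, consider two abstract atoms $A_1^{\mathcal{A}}, A_2^{\mathcal{A}}$, and let
$I_i = \gamma(\{A_i^{\mathcal{A}}\})$ ($i = 1, 2$). For two atoms $A_1 \in I_1$, $A_2 \in I_2$, any
common instance is in $I_1 \cap I_2$, since $I_1, I_2$ are downwards-closed. Now let
$cI_{AC+}(A_1^{\mathcal{A}}, A_2^{\mathcal{A}}) = \{A_1^{\mathcal{A}}\theta^{\mathcal{A}} \mid \theta^{\mathcal{A}} \in cU_{AC+}(A_1^{\mathcal{A}}, A_2^{\mathcal{A}})\}$.
We might call $cI_{AC+}(A_1^{\mathcal{A}}, A_2^{\mathcal{A}})$ a complete set of common
AC+-instances of $A_1^{\mathcal{A}}, A_2^{\mathcal{A}}$. Optimality of abstract unification means
that $cI_{AC+}(A_1^{\mathcal{A}}, A_2^{\mathcal{A}})$ describes only the atoms in $I_1 \cap I_2$.

Theorem 6.10 [CL00, Thm. 4.6] For $i = 1, 2$, let $A_i^{\mathcal{A}}$ be abstract atoms and
$I_i = \gamma(\{A_i^{\mathcal{A}}\})$. Then $cI_{AC+}(A_1^{\mathcal{A}}, A_2^{\mathcal{A}})$ describes only
the atoms in $I_1 \cap I_2$.

Proof. To derive a contradiction, assume that $B$ is an atom such that for some
$B^{\mathcal{A}} \in cI_{AC+}(A_1^{\mathcal{A}}, A_2^{\mathcal{A}})$, we have $B^{\mathcal{A}} \propto B$ but
$B \notin I_1 \cap I_2$. This implies that $A_1^{\mathcal{A}} \not\propto B$ or
$A_2^{\mathcal{A}} \not\propto B$. On the other hand, $B^{\mathcal{A}}$ is a common instance of
$A_1^{\mathcal{A}}, A_2^{\mathcal{A}}$, which implies $A_1^{\mathcal{A}} \propto B$ and
$A_2^{\mathcal{A}} \propto B$. Contradiction. □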

6.3 Abstraction of Programs 



In [LS01, SHK00], programs were assumed to be in normal, also called canonical,
form. In [LS01], the abstraction of a unification constraint $x = y$ involves computing
the intersection of the two grammars corresponding to $x$ and $y$. In addition, the
abstraction of a unification constraint $y = f(x_1, \dots, x_n)$ involves computing a grammar
for $f(x_1, \dots, x_n)$ from the $n$ grammars for $x_1, \dots, x_n$. In [SHK00], no such operations
are performed, but still the abstraction of unification constraints is not obvious. Each
unification constraint $y = f(x_1, \dots, x_n)$ is abstracted as a call
$f_{rel}(y, x_1, \dots, x_n)$, where $f_{rel}$ is a predicate that expresses the relationship
between the abstraction of a term and its subterms.

In the framework we have set up here following [CL00], abstracting a program is
much simpler. We have designed the domains so that Thm. 6.9 holds, and so we can
abstract a program simply by replacing each term with its abstraction. Thus $\alpha$ is lifted
in the obvious way to atoms, clauses, programs and queries.

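The semantics of the abstract program will be defined using an AC+-enhanced
version of the $T_P$-operator. Formally

$$
\begin{aligned}
T_P^{\mathcal{A}}(I^{\mathcal{A}}) = \{\, \alpha(H)\theta^{\mathcal{A}} \mid {} & C = H \leftarrow B_1, \dots, B_n \in P,\ \langle A_1^{\mathcal{A}}, \dots, A_n^{\mathcal{A}} \rangle \ll_C I^{\mathcal{A}}, \\
& \theta^{\mathcal{A}} \in cU_{AC+}(\langle \alpha(B_1), \dots, \alpha(B_n) \rangle, \langle A_1^{\mathcal{A}}, \dots, A_n^{\mathcal{A}} \rangle) \,\}.
\end{aligned}
$$

We denote by $[\![\alpha(P)]\!]_{AC+}$ the least fixpoint of $T_P^{\mathcal{A}}$, which exists by
[CL00, Cor. 5.2]. The following theorem says that this abstract semantics correctly
describes the concrete semantics.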



Theorem 6.11 [CL00, Thm. 5.4] Let $P$ be a program. Then
$[\![\alpha(P)]\!]_{AC+} \propto [\![P]\!]_{s}$.



We could make further statements about the semantics, e.g. call and answer patterns;
for that purpose, we would use the magic set transformation. However, those
results would be completely along the lines of [CL96, CL00], as are the proofs of the
results we gave in this subsection.



6.4 Finiteness 



In [CL00] we find a result that the abstract semantics of a program is finite provided
that the type abstraction is monomorphic. The result does not hold anymore for
polymorphic type abstractions, and the authors give the program

P1 := {p([X]) ← p(X)., p(1).}

as an example. As a solution, the authors propose a depth-k abstraction, i.e., some
ad-hoc bound on the depth of types.

It is understandable that a descriptive view of types leads to the conviction that
infinity of the abstract semantics is inherent in a polymorphic type abstraction and
cannot reasonably be avoided.

Instead of P1, consider the program

P2 := {p([X]) ← p(X)., p([]).}

For the argument in [CL00], P2 is completely equivalent to P1, i.e., the authors could
just as well have chosen P2. However, using P2 allows us to make a stronger point
than using P1. The reason is that both programs are not typable by the rules of
Table 1 (that is to say, in a prescriptive approach to typing), but for P2, this is much
less obvious. The program P2 is forbidden due to the head condition [HT92], i.e., the
special typing rule Head which is different from rule Atom (see Table 1).



Proposition 6.12 Assuming $\mathcal{F}_{LISTS}$ (see Ex. 3.13), there exists no variable typing
$\Gamma$ such that $\Gamma \vdash$ p([X]) ← p(X) Clause, regardless of what the declared type of p is.

Proof. To derive $\Gamma \vdash$ p([X]) ← p(X) Clause, we need to derive
$\Gamma \vdash$ p([X]) Head, and in turn,

$$\Gamma \vdash [X] : \sigma \qquad (9)$$

for some type $\sigma$. By rule Func and the fact that the declared range type of cons is
list(U), it follows that $\sigma = \mathrm{list}(\tau)$ for some $\tau$, and so for
$\Gamma \vdash$ p([X]) Head to be a valid type judgement, the declared type of p must be
list($\tau$), so we write $p_{\mathrm{list}(\tau)}$.

To derive $\Gamma \vdash$ p([X]) ← p(X) Clause, we also need to derive
$\Gamma \vdash p_{\mathrm{list}(\tau)}(X)$ Atom, and in turn,
$\Gamma \vdash X : \mathrm{list}(\tau)\Theta$ for some type substitution $\Theta$. This implies that
$X : \mathrm{list}(\tau)\Theta \in \Gamma$ and hence
$\Gamma \vdash [X] : \mathrm{list}(\mathrm{list}(\tau))\Theta$, and in particular,
$\Gamma \nvdash [X] : \mathrm{list}(\tau)$. This is a contradiction to (9), showing that there
exists no $\Gamma$ such that $\Gamma \vdash$ p([X]) ← p(X) Clause. □

We want to show that disregarding such programs, the abstract semantics is always
finite. We first need the following lemma about concrete programs, stating that the
arguments of any atom in $[\![P]\!]_{s}$ are of the declared type.

Lemma 6.13 Let $P$ be a typed program. For any atom
$p_{\tau_1 \dots \tau_n}(t_1, \dots, t_n) \in [\![P]\!]_{s}$, we
have $\_ \vdash (t_1, \dots, t_n) : (\tau_1, \dots, \tau_n)$.

Proof. Suppose $I$ is a set of atoms having the property stated for $[\![P]\!]_{s}$. We show
that an application of the $T_P$-operator to $I$ preserves the property. This immediately
implies the result.

Consider some clause $C = p_{\tau_1 \dots \tau_n}(t_1, \dots, t_n) \leftarrow B_1, \dots, B_m$ in $P$,
and suppose that $\langle A_1, \dots, A_m \rangle \ll_C I$, such that
$\theta = MGU(\langle B_1, \dots, B_m \rangle, \langle A_1, \dots, A_m \rangle)$. By the rules
in Table 1, in particular Head, we have $\_ \vdash (t_1, \dots, t_n) : (\tau_1, \dots, \tau_n)$. For
each $B_i = q_{\sigma_1 \dots \sigma_{n'}}(s_1, \dots, s_{n'})$, by the rules in Table 1, we have
$\_ \vdash (s_1, \dots, s_{n'}) : (\sigma_1, \dots, \sigma_{n'})\Theta$ for some $\Theta$. Let
$A_i = q(r_1, \dots, r_{n'})$. By assumption about $I$, we have
$\_ \vdash (r_1, \dots, r_{n'}) : (\sigma_1, \dots, \sigma_{n'})$ and hence, by the typing rules, also
$\_ \vdash (r_1, \dots, r_{n'}) : (\sigma_1, \dots, \sigma_{n'})\Theta$. By standard results
[HT92, Thm. 1.4.1, Lemma 1.4.2], it follows that
$\_ \vdash (s_1, \dots, s_{n'})\theta : (\sigma_1, \dots, \sigma_{n'})\Theta$. Since the choice of $A_i$
was arbitrary, it follows that each atom in $C\theta$ can be typed using the same types as for
the corresponding atom in $C$. This applies in particular for the clause head, and so
$\_ \vdash (t_1, \dots, t_n)\theta : (\tau_1, \dots, \tau_n)$. □



By Prop. 4.6, the above lemma applies also to the abstraction of a program.

Corollary 6.14 Let $P$ be a typed program. For any atom
$p_{\tau_1 \dots \tau_n}(a_1, \dots, a_n) \in [\![\alpha(P)]\!]_{AC+}$, we have
$\_ \vdash^{\mathcal{A}} (a_1, \dots, a_n) : (\tau_1, \dots, \tau_n)$.

The following lemma states that for a given type $\phi$, there are only finitely many
different abstract terms for $\phi$.

Lemma 6.15 For any type $\phi$, the set of abstract terms
$\{a \mid \_ \vdash^{\mathcal{A}} a : \phi\}$ is finite modulo $\approx$.

Proof. Since the claim is that the set is finite modulo $\approx$, it is clearly sufficient to
restrict our attention to normal abstract terms. It is useful to recall the notations and
definitions of Subsec. 4.2.

The proof is along the lines of the proof of [CL00, Thm. 3.2], but matters are slightly
more complicated since our abstract terms are nested.
We define by structural induction:

• for each abstract term $a \oplus x^{\oplus}$, $\epsilon$ is a path, and any variable in
$x^{\oplus}$ is a variable occurring in $a \oplus x^{\oplus}$ at $\epsilon$;

• for an abstract term $c^{\mathcal{A}}(a_1, \dots, a_m) \oplus \dots$, if $\zeta$ is a path for
$a_j$ and $x$ is a variable occurring in $a_j$ at $\zeta$, then $j.\zeta$ is a path for
$c^{\mathcal{A}}(a_1, \dots, a_m) \oplus \dots$, and $x$ is a variable occurring in
$c^{\mathcal{A}}(a_1, \dots, a_m) \oplus \dots$ at $j.\zeta$.

By the well-definedness of normal abstract terms, it follows that there is a maximum
number of paths that a normal abstract term for $\phi$ can have. Let $n$ be this
number for our given $\phi$.

Suppose that a normal abstract term $a$ for $\phi$ contains more than $2^n - 1$ distinct
variables. By a simple combinatorial argument, one sees that there must be at least two
variables, say $x$ and $y$, occurring at exactly the same paths in $a$. Consider
$a' = a\{x/z, y/z\}$. Trivially $a' \le_{AC+} a$. On the other hand, we have
$a'\{z/x \oplus y\} =_{AC+} a$, and thus $a \le_{AC+} a'$. So we have $a \approx a'$, and
$\#(vars(a')) = \#(vars(a)) - 1$.

By iterating this argument, it follows that any normal abstract term for $\phi$ is
$\approx$-equivalent to a term containing no more than $2^n - 1$ variables, and thus the claim
follows. □
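The counting step of this proof can be illustrated with a few lines of Python. The sketch assumes a hypothetical input format, a dictionary from variable names to the set of paths at which each variable occurs, and merges variables that occur at exactly the same paths; the function name `merge_same_paths` and the sample data are made up for the illustration.

```python
def merge_same_paths(occurrences):
    """Map each variable to a representative occurring at the same set of paths.

    occurrences: dict from variable name to the set of paths it occurs at.
    """
    representative = {}          # frozenset of paths -> chosen variable
    renaming = {}
    for var in sorted(occurrences):
        key = frozenset(occurrences[var])
        renaming[var] = representative.setdefault(key, var)
    return renaming


paths = ["eps", "1"]             # hypothetical term with n = 2 paths
occ = {"X": {"eps"}, "Y": {"eps"}, "Z": {"1"}, "W": {"eps", "1"}, "V": {"1"}}
renaming = merge_same_paths(occ)
survivors = set(renaming.values())
print(len(survivors), 2 ** len(paths) - 1)
```

Every surviving variable is identified by a distinct non-empty set of paths, so at most $2^n - 1$ variables can survive, which is the bound used in the proof.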

The following is a simple corollary. 

Corollary 6.16 Let $p_{\tau_1 \dots \tau_n}$ be a predicate and $\Theta$ a type substitution. Modulo
$\approx$, there are only finitely many abstract atoms $p(a_1, \dots, a_n)$ such that
$(a_1, \dots, a_n)$ is a vector of normal abstract terms for the type vector
$(\tau_1, \dots, \tau_n)\Theta$.

Theorem 6.17 Let $P$ be a typed program. Then $[\![\alpha(P)]\!]_{AC+}$ is finite.

Proof. By Corollaries 6.14 and 6.16. □

As it stands, the theorem depends critically on the fact that we assume a bottom-up
semantics. To explain this, consider the program

P3 = {p(X) ← p([X])., p([]).},

which at first look is very similar to the program P2 given at the beginning of this
subsection. However, assuming that p has declared type list(U), the program P3 is
typed according to the rules of Table 1. Therefore, of course, Thm. 6.17 applies to
this program.

Note that P3, when called with the query p(Y), gives rise to infinitely many calls
p(Y), p([Y]), p([[Y]]), . . . , with abstractions p(Y), p($\mathrm{list}^{\mathcal{A}}$(Y)),
p($\mathrm{list}^{\mathcal{A}}$($\mathrm{list}^{\mathcal{A}}$(Y))), . . . .
So the set of calls cannot be described (in the technical sense, using $\propto$) finitely.
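The growth of the call patterns of P3 can be made concrete with a short Python sketch. It is an illustration only: calls are modelled as nested one-element Python lists around the string "Y", and `abstract_arg` prints the normal form of the abstraction directly as a string rather than computing it by AC+ rewriting; both helper names are hypothetical.

```python
def calls(n):
    """Arguments of the first n calls p(Y), p([Y]), p([[Y]]), ... of P3."""
    arg = "Y"
    for _ in range(n):
        yield arg
        arg = [arg]              # the body atom p([X]) wraps the argument in a list


def abstract_arg(arg):
    """Normal form of the abstraction of Y, [Y], [[Y]], ... written as a string."""
    if isinstance(arg, str):
        return arg
    (inner,) = arg
    return f"list^A({abstract_arg(inner)})"


abstractions = [f"p({abstract_arg(a)})" for a in calls(4)]
print(abstractions)
```

Each further call adds one level of nesting, so the abstract call patterns produced this way are pairwise distinct for every prefix of the call sequence.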

We make two observations about P3:

• The magic set version [CD95] of the program contains the clause

p=([X]) ← p=(X).,

which is to be read as "in order for p([X]) to be called, p(X) must be called". This
clause is not typable according to the rules of Table 1 as the head condition is
violated. Thus Thm. 6.17 is not applicable. This is not surprising given the fact
that the very purpose of the magic set transformation is to characterise calls.

• In the literature on prescriptive typing, the behaviour exhibited by P3 has been
called polymorphic recursion [Hen93]. It is by no means common. In fact, it is
very much on the borderline of what is allowed in prescriptively typed programming
languages. In ML for example, it is forbidden, as it breaks the capabilities
of the type inference procedure.

It has previously been suspected that there is an interesting relationship between the
head condition and polymorphic recursion, which deserves some profound investigation
[DS01]. The two observations above add weight to this.

In this paper, we do not want to study the difference between top-down and bottom-up
semantics in detail. Nevertheless, we now formulate what it means for a program
not to use polymorphic recursion, in which case we say that it only uses monomorphic
recursion.

For simplicity, we assume that a program only contains direct recursion, i.e., no
recursion of the form p ← . . . q, q ← . . . p. It is straightforward to generalise our
considerations otherwise. In the following, we use $\bar t$ ($\bar\tau$) to denote vectors of terms
(types).

Definition 6.18 A typed program uses only monomorphic recursion if for any
clause $p(\bar t) \leftarrow \dots p(\bar s) \dots$, we have $\_ \vdash \bar t : \bar\tau,\ \bar s : \bar\tau$
for some vector of types $\bar\tau$.

Thus monomorphic recursion means that in each clause, the type of any recursive 
call must be identical to the type of the head. Since the head condition is still in force, 
this implies that any recursive call must have a type which is identical to the declared 
type of its predicate. 

For programs using only monomorphic recursion, it should be possible to devise a
variant of the magic set transformation such that the transformed program is typed
according to the rules in Table 1, and therefore has a finite abstract semantics.



7 Towards an Implementation 



So far, we have not implemented the analysis proposed in this paper. As far as
computing the semantics of the abstract program is concerned, the only difference with
the implementations mentioned in [CL96, CL00] is that instead of ACI or ACI1 we
have the equality theory AC+. The former theories are finitary [BS98], and the
corresponding unification problems are NP-complete. Obviously, AC+ cannot behave any
better. Studying AC+ is a topic for future work, but we would certainly expect it to
be finitary as well.

There is an implementation of the analysis we proposed in [SHK00], which essentially
aims at the same degree of precision we have here, but the framework is
different. In fact, this paper relates to [SHK00] in the same way as [CL96, CL00]
relates to [CD94]. This is interesting because the authors mention that an implementation
using ACI-unification turned out to be much faster than the implementation
in [CD94].



Note that to compute the abstraction of a program, in [CD94, CL00], it is the
user who has to provide information about the particular type language used in a
program (see paragraph after Def. 4.1), whereas in our analysis, this information is
extracted from the declared types. We had previously shown [SHK00] that analysing
the type declarations (computing the NRSs and recursive types) is viable even for
some contrived, complex type declarations, which one would never expect in practice,
since good programs have small types [Hen93].*

* However, this should not be understood as a contradiction to our NESTS or TABLES examples.
Henglein comes from a functional programming background, and in that community, those type
declarations would by all means still qualify as "small".

In [CL96], there is some speculation as to why abstract unification is not as bad as
the theoretical result that it is NP-complete suggests. It is said that usually
unifications involve "few" variables. Here we want to substantiate that claim somewhat,
but it remains speculation all the same.

• The first argument is that abstract terms (in normal form) are likely to be
linear. Recall that the abstract terms are designed in such a way that different
positions correspond to different subterm types. Since we use prescriptive types,
one is tempted to conclude that abstract terms must be linear, since the same
variable cannot have different types. That however would be a fallacy, since via
instantiation of types, different subterm types can become equal. For example,
the term node(null, X, X, eq, null) of type table(string) has the abstraction
table^A(X, bal^A, X), which is not linear. Nevertheless, this should be an exception.

• On the whole, logic programs are based on a very simple notion of modes. Of
course, it is the very exceptions to this rule that justify developing a complex
instantiation analysis like the one of this paper or [CL00, LS01], but still, often one
deals with simple assignments rather than full unification in concrete programs,
and this carries through to abstract programs.
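
The notion of linearity used in the first argument can be made concrete with a small sketch. It assumes a simple hypothetical term representation (Var, Fun) and merely counts variable occurrences. It is not part of the analysis itself, and the symbol names table^A and bal^A are used only as labels.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Fun:
    symbol: str
    args: Tuple["Term", ...] = ()


Term = Union[Var, Fun]


def variable_occurrences(term: Term) -> Counter:
    """Count how often each variable occurs in the term."""
    if isinstance(term, Var):
        return Counter([term.name])
    counts: Counter = Counter()
    for arg in term.args:
        counts.update(variable_occurrences(arg))
    return counts


def is_linear(term: Term) -> bool:
    """A term is linear if no variable occurs in it more than once."""
    return all(n == 1 for n in variable_occurrences(term).values())


X, Y = Var("X"), Var("Y")
bal = Fun("bal^A")
non_linear = Fun("table^A", (X, bal, X))   # the exceptional case from the text
linear = Fun("table^A", (X, bal, Y))       # the expected common case
```

Here is_linear(non_linear) is false because X occurs twice, whereas is_linear(linear) is true.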



To give at least one example of the advance of our analysis over [CL00], we use
table(int). Suppose there is a predicate insert/4 whose arguments represent: a
table t, a key k, a value v, and a table obtained from t by inserting the node whose key
is k and whose value is v. From the abstract semantics of the program, it is possible
to read that a query whose abstraction is

insert(table^A(int^A, bal^A, str^A), str^A, V2, T),

i.e., a query to insert an uninstantiated value into a ground table, yields an answer
whose abstraction is

insert(table^A(int^A, bal^A, str^A), str^A, V2, table^A(int^A ⊕ V2, bal^A, str^A)),

i.e., the result is a table whose values may be uninstantiated.
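
To make the reading of the value position in this answer more tangible, here is a minimal sketch that covers only the ACI part of the equality theory. It assumes that a sum a1 ⊕ ... ⊕ an can be represented by the set of its summands, so that associativity, commutativity and idempotence hold by construction. The additional axioms of AC+ are ignored, and all function names as well as the convention that variables start with an upper-case letter are assumptions of this sketch.

```python
from typing import FrozenSet

# An ACI sum a1 ⊕ ... ⊕ an is represented by the set of its summands.
Sum = FrozenSet[str]


def atom(name: str) -> Sum:
    """A sum consisting of a single summand."""
    return frozenset([name])


def oplus(left: Sum, right: Sum) -> Sum:
    """ACI combination: set union is associative, commutative and idempotent."""
    return left | right


def is_variable(name: str) -> bool:
    """Convention of this sketch: variables start with an upper-case letter."""
    return name[:1].isupper()


def may_be_uninstantiated(s: Sum) -> bool:
    """A position may be uninstantiated if its sum still contains a variable."""
    return any(is_variable(a) for a in s)


def instantiate(s: Sum, var: str, value: Sum) -> Sum:
    """Replace variable `var` by the sum `value` and re-normalise."""
    return (s - {var}) | value if var in s else s


int_a, v2 = atom("int^A"), atom("V2")
values_before = int_a              # value position of the ground input table
values_after = oplus(int_a, v2)    # value position of the answer table
```

In this model, the value position of the answer may be uninstantiated as long as V2 is unbound, and it collapses back to int^A once V2 is instantiated to int^A.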



8 Discussion 



In this paper, we have proposed a formalism for deriving abstract domains from
the type declarations of a typed program. Effectively, we have recast our previous
work [SHK00] using important parts of the formalisms of [CL00, LS01]. We now
compare this paper with those two works under several aspects.



The type system. Using the terminology introduced in this paper, one can say
that [CL00] uses a polymorphic type system with the following assumptions: types are
either monomorphic or unary, and the only subterm types of a unary type c(u) are c(u)
itself (and c(u) ⋈ c(u)) and u (and u ◁ c(u)). This is the simplest thinkable scenario
of proper polymorphism; in fact, only lists and trees are covered. Our TABLES example,
let alone the NESTS example, is not covered. In contrast, [LS01] assumes regular types
without polymorphism. Thus there are only finitely many types the analysis has to deal
with. However, those may be very complex; e.g., one can easily construct a grammar
that corresponds to the type table(int). So the type systems of [CL00, LS01] are
not formally comparable, but the type system we assume in this paper is a strict
generalisation of both.



Descriptive vs. prescriptive types. According to the authors' claims, [CL00]
takes a descriptive view of typing, whereas [LS01] takes a prescriptive view. However,
we find that the formalism of [CL00] can very well be adapted to prescriptive typing.
On the other hand, we find that some aspects of [LS01] belong rather to a descriptive
view of typing.



First, the fact that the typing approach of [CL00] is descriptive rightly accounts
for the fact that they must consider "ill-typed" terms such as [1|2]. In this paper, all
terms are "well-typed", and so are the abstract terms.



In [LS01] it is assumed that a unique type (or equivalently, grammar) is
associated with each program variable. A unification constraint in a program gives rise to
operations such as computing the intersection of two types and computing the type
of a term from the types of its subterms. Such operations can improve the precision
of an analysis, e.g. if X has the declared type "list with an even number of elements"
and Y has the declared type "list with a number of elements divisible by 3", then a
unification X = Y implies that both variables have the type "list with a number of
elements divisible by 6". In our opinion, the presence of such operations introduces an
aspect of type inference into their formalism which is somewhat in contradiction to a
prescriptive approach to typing. While such type inference may be useful, the authors
do not give a convincing example for it being so.
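
The arithmetic behind the list-length example can be sketched as follows. The sketch assumes, purely for illustration, that a type of the form "list with a number of elements divisible by n" is represented by the positive integer n. This is not how [LS01] represent grammars, and the function names are hypothetical.

```python
from math import gcd


def intersect_length_types(m: int, n: int) -> int:
    """Intersection of 'length divisible by m' and 'length divisible by n'.

    A length is divisible by both m and n exactly when it is divisible by
    their least common multiple.
    """
    return m * n // gcd(m, n)


def has_type(length: int, divisor: int) -> bool:
    """Does a list of the given length belong to 'length divisible by divisor'?"""
    return length % divisor == 0
```

For the types in the text, intersect_length_types(2, 3) yields 6, matching the type derived for X and Y after the unification X = Y.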



Labellings. Labellings are useful to formalise which aspects of the structure of a
concrete term we want to capture by our analysis, and so it is natural that we used
them mainly in Sec. ^. In [CL00], they are absent, although they may have been useful
(see Sec. I). In [SHK00], there were similar functions called extractors and termination
functions.

First note that C only collects variables, whereas Z also collects non-variable terms.
This generalisation allows us to describe the relation between a term and its abstraction 
as we did in Sec. 0. 



The labelling function in [LS01] has three arguments: a grammar (which however
can be identified with its starting non-terminal), a non-terminal to be labelled, and a
labelling term. Our labelling function Z has four arguments. We found it useful to
have as first argument a flat type (e.g. nest(V)) which gives us a certain grammar,
but also allow for productions of that grammar starting from some other non-terminal
(e.g. list(nest(V))). Actually, it may well be the case that the first argument is
redundant, i.e. that the grammar can be derived from the starting non-terminal (e.g. nest(V)
from list(nest(V))). We prefer however to keep this useful intuitive explanation: one
argument to indicate the grammar, one to indicate the starting non-terminal. The
difference between our labelling function and that of [LS01] is due to polymorphism.



Abstract terms. In [LS01], the abstraction of terms is not actually made explicit,
but effectively, given a program variable x, its abstraction is the (somehow ordered)
tuple of non-terminals of the grammar of x. Non-terminals are thought of as abstract
variables (see the paragraph after Equation (0)). Our abstraction of terms, denoted
α, is designed in such a way that the abstraction type in [CL00] is essentially a special
case of it. We do not introduce abstract variables but rather collect the labellings for
a term in a structure called an abstract term. The reason for this decision is that it allows
us to deal with arbitrarily large grammars/type graphs.



Type hierarchies. Given a function f_{...→c(u)}, the abstraction type in [CL00]
distinguishes between the argument positions of declared type u and the "recursive"
argument positions. Via type instantiation, this already gives rise to a certain hierarchy of
arbitrary depth, as reflected for example in the abstract term list^A(list^A(int^A)).
Our concepts of non-recursive subterm type and recursive type generalise this idea. An
NRS of a flat type τ is not necessarily a parameter, and τ can have other recursive types
than τ itself. This hierarchy allows us to deal with the fact that through instantiation
of types, a polymorphic language gives rise to arbitrarily big type graphs (grammars).

In contrast, in [LS01], all non-terminals (types) reachable from the starting node
of a grammar are treated in the same way. This approach is viable since the size of 
the grammars is fixed beforehand. 



Equality theory. The equality theory for evaluating abstract terms in [LS01] is
ACI1. Distributivity is not applicable. In [CL00], the equality theory is ACI, so there
is no neutral element. The authors mention distributivity but decide against it. This
is in contrast to [CL96], where the equality theory is ACI1 plus distributivity. Our
extraction axioms are only relevant for a language where some type has a recursive
type other than itself, and so they are not applicable to [CL96, CL00]. We believe that
at least conceptually, both a neutral element and the distributivity axioms are very
natural, even if at the level of an implementation, it might be preferable not to have
them.
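
For reference, the standard axioms behind the theory names used in this paragraph can be written as follows, with $\oplus$ as the binary operator and $1$ as the neutral element. The last line shows distributivity only in an assumed generic form for a unary abstract function symbol $c$; the precise formulation in [CL96] and the extraction axioms of this paper are not reproduced here.

```latex
\begin{align*}
(x \oplus y) \oplus z &= x \oplus (y \oplus z) && \text{(A, associativity)}\\
x \oplus y &= y \oplus x && \text{(C, commutativity)}\\
x \oplus x &= x && \text{(I, idempotence)}\\
x \oplus 1 &= x && \text{(1, neutral element)}\\
c(x) \oplus c(y) &= c(x \oplus y) && \text{(distributivity, assumed form)}
\end{align*}
```

ACI consists of the first three laws, and ACI1 adds the fourth.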



Types = abstract terms? In [CL00], there is no distinction between a type
constructor c and the function to build abstract terms. Also, the equivalent of our
function α is called type abstraction and denoted by type, which highlights the fact
that in descriptive approaches to typing, type analysis and mode analysis are blurred
almost to the extent of being considered to be the same thing. However, such an
identification only works because the assumptions about the type system are so restrictive.



Thus we have generalised [CL00, LS01] by considering a type system which almost
(see below) corresponds to the type system of existing typed programming languages. 
We have given several examples in Sec. || hoping to convince the reader that such 
a generalisation is non-trivial. In particular, there are two natural requirements: the 
construction of an abstract domain for a polymorphic type should be truly parametric, 
and the abstract domains should be finite for a given program and query. We had to 
impose two conditions on the type declarations to ensure these requirements. On a 
technical level, the fact that the SCCs of a type graph are not stable under instantiation 
makes it difficult to meet those requirements. 

We now very briefly recall some other related work. We refer to the discussions
in [CL00], [LS01], [SHK00] for more details.

Both this paper and [CL00, LS01] build on ideas presented originally in [CD94].

Recursive modes [TL97] characterise that the left spine, right spine, or both, of a
term are instantiated. This seems ad-hoc but often coincides with characterising that 
all recursive subterms of a term are instantiated. 



A system for type analysis of Prolog is presented in [VCL93]. It takes a descriptive
approach to typing, and the abstract domains are, in general, infinite. Therefore,
widenings must be used. Similarly, in [JB92], the finiteness of abstract domains and
terms is ensured by imposing an ad-hoc bound on the number of symbols.

It would probably be possible to express the abstraction of terms proposed here as
an application of a particular pre-interpretation [GBS95].



A classical instantiation analysis is not interesting for Mercury [SHC96] as the
language is strongly moded. However, our work might also have applications for
Mercury.